Conway–Bromage–Lyndon (CBL): an exact, ...
Document type :
Compte-rendu et recension critique d'ouvrage
Permalink :
Title :
Conway–Bromage–Lyndon (CBL): an exact, dynamic representation of k -mer sets
Author(s) :
Martayan, Igor [Auteur]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Cazaux, Bastien [Auteur]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Limasset, Antoine [Auteur]
Centre National de la Recherche Scientifique [CNRS]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Marchet, Camille [Auteur]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Centre National de la Recherche Scientifique [CNRS]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Cazaux, Bastien [Auteur]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Limasset, Antoine [Auteur]
Centre National de la Recherche Scientifique [CNRS]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Marchet, Camille [Auteur]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Centre National de la Recherche Scientifique [CNRS]
Journal title :
Bioinformatics
Pages :
i48-i57
Publisher :
Oxford University Press (OUP)
Publication date :
2024-07-28
ISSN :
1367-4803
HAL domain(s) :
Informatique [cs]
Informatique [cs]/Bio-informatique [q-bio.QM]
Informatique [cs]/Bio-informatique [q-bio.QM]
English abstract : [en]
Abstract Summary In this article, we introduce the Conway–Bromage–Lyndon (CBL) structure, a compressed, dynamic and exact method for representing k-mer sets. Originating from Conway and Bromage’s concept, CBL innovatively ...
Show more >Abstract Summary In this article, we introduce the Conway–Bromage–Lyndon (CBL) structure, a compressed, dynamic and exact method for representing k-mer sets. Originating from Conway and Bromage’s concept, CBL innovatively employs the smallest cyclic rotations of k-mers, akin to Lyndon words, to leverage lexicographic redundancies. In order to support dynamic operations and set operations, we propose a dynamic bit vector structure that draws a parallel with Elias-Fano’s scheme. This structure is encapsulated in a Rust library, demonstrating a balanced blend of construction efficiency, cache locality, and compression. Our findings suggest that CBL outperforms existing dynamic k-mer set methods. Unique to this work, CBL stands out as the only known exact k-mer structure offering in-place set operations. Its different combined abilities position it as a flexible Swiss knife structure for k-mer set management. Availability and implementation https://github.com/imartayan/CBL.Show less >
Show more >Abstract Summary In this article, we introduce the Conway–Bromage–Lyndon (CBL) structure, a compressed, dynamic and exact method for representing k-mer sets. Originating from Conway and Bromage’s concept, CBL innovatively employs the smallest cyclic rotations of k-mers, akin to Lyndon words, to leverage lexicographic redundancies. In order to support dynamic operations and set operations, we propose a dynamic bit vector structure that draws a parallel with Elias-Fano’s scheme. This structure is encapsulated in a Rust library, demonstrating a balanced blend of construction efficiency, cache locality, and compression. Our findings suggest that CBL outperforms existing dynamic k-mer set methods. Unique to this work, CBL stands out as the only known exact k-mer structure offering in-place set operations. Its different combined abilities position it as a flexible Swiss knife structure for k-mer set management. Availability and implementation https://github.com/imartayan/CBL.Show less >
Language :
Anglais
Popular science :
Non
ANR Project :
Collections :
Source :
Submission date :
2024-11-05T03:05:32Z
Files
- document
- Open access
- Access the document
- 2024.01.29.577700v2.full.pdf
- Open access
- Access the document