Document type :
Preprint or working paper
Permalink :
Title :
Advancements in colored k-mer sets: essentials for the curious
Author(s) :
Marchet, Camille [Author]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
English keyword(s) :
k-mer
colored de Bruijn graph
SBWT
hash table
MPHF
trie
Bloom filter
HAL domain(s) :
Computer Science [cs]
Life Sciences [q-bio]
English abstract : [en]
This paper provides a comprehensive review of recent advancements in k-mer-based data structures representing collections of several samples (sometimes called colored de Bruijn graphs) and their applications in large-scale sequence indexing and pangenomics. The review explores the evolution of k-mer set representations, highlighting the trade-offs between exact and inexact methods, as well as the integration of compression strategies and modular implementations. I discuss the impact of these structures on practical applications and describe recent utilization of these methods for analysis. By surveying the state-of-the-art techniques and identifying emerging trends, this work aims to guide researchers in selecting and developing methods for large-scale and reference-free genomic data. For a broader overview of k-mer set representations and foundational data structures, see the accompanying article on practical k-mer sets.
Language :
English
Collections :
Source :
Submission date :
2024-09-11T02:03:02Z
Files
- document
- Open access
- Advancements_in_colored_k_mer_sets-2.pdf
- Open access