Document type:
Preprint or working paper
Title:
Alignment-free detection and seed-based identification of multi-loci V(D)J recombinations in Vidjil-algo
Author(s):
Borée, Cyprien [Author]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Giraud, Mathieu [Author]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Salson, Mikael [Author]
Bioinformatics and Sequence Analysis [BONSAI]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Keyword(s) in English:
Spaced seeds
Aho-Corasick Automaton
Alignment-free algorithm
Immune repertoire
VDJ recombinations
Adaptive immune receptor repertoire
HAL discipline(s):
Computer Science [cs]/Bioinformatics [q-bio.QM]
Abstract in English: [en]
The diversity of the immune repertoire is grounded in V(D)J recombinations in several loci. Many algorithms and software tools detect and designate these recombinations in high-throughput sequencing data. To improve their efficiency, we propose a multi-loci seed identification through an Aho-Corasick-like automaton as well as a seed-based gene filtration. These algorithms were implemented in Vidjil-algo, which is used routinely by several labs for the analysis of hematologic malignancies. We benchmark the results of Vidjil-algo and of MiXCR on five datasets, evaluating the specificity and sensitivity of the detection, as well as the agreement of the designation with manually curated sequences. Compared to the previous algorithms, the new algorithms implemented in Vidjil-algo bring speedups between 3× and 30×, with a smaller memory footprint and without quality loss in results. They make it possible to precisely annotate, in a few minutes, millions of sequences coming from V(D)J recombinations, including incomplete V(D)J-like recombinations, improving our knowledge of immune repertoires.
Language:
English
Files
- vidjil-algo.pdf (open access)
- vidjil-algo-supp.pdf (open access)