Document type:
Journal article: Original article
Title:
Assessment of Common and Emerging Bioinformatics Pipelines for Targeted Metagenomics
Author(s):
Siegwald, Léa [Author]
Bioinformatics and Sequence Analysis [BONSAI]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Touzet, Helene [Author]
Bioinformatics and Sequence Analysis [BONSAI]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Lemoine, Yves [Author]
Institut Pasteur de Lille
Centre d’Infection et d’Immunité de Lille - INSERM U 1019 - UMR 9017 - UMR 8204 [CIIL]
Hot, David [Author]
Institut Pasteur de Lille
Centre d’Infection et d’Immunité de Lille - INSERM U 1019 - UMR 9017 - UMR 8204 [CIIL]
Audebert, Christophe [Author]
Institut Pasteur de Lille
Caboche, Ségolène [Author]
Institut Pasteur de Lille
Centre d’Infection et d’Immunité de Lille - INSERM U 1019 - UMR 9017 - UMR 8204 [CIIL]
Journal title:
PLOS ONE
Pagination:
e0169563
Publisher:
Public Library of Science
Publication date:
2017
ISSN:
1932-6203
HAL discipline(s):
Life Sciences [q-bio]/Bioinformatics, Systems Biology [q-bio.QM]
Computer Science [cs]/Bioinformatics [q-bio.QM]
Abstract in English: [en]
Targeted metagenomics, also known as metagenetics, is a high-throughput sequencing application focusing on a nucleotide target in a microbiome to describe its taxonomic content. A wide range of bioinformatics pipelines are available to analyze sequencing outputs, and the choice of an appropriate tool is crucial and not trivial. No standard evaluation method exists for estimating the accuracy of a pipeline for targeted metagenomics analyses. This article proposes an evaluation protocol containing real and simulated targeted metagenomics datasets, and adequate metrics allowing us to study the impact of different variables on the biological interpretation of results. This protocol was used to compare six different bioinformatics pipelines in the basic user context: three common ones (mothur, QIIME and BMP) based on a clustering-first approach and three emerging ones (Kraken, CLARK and One Codex) using an assignment-first approach. This study surprisingly reveals that sequencing errors have a bigger impact on the results than the choice of amplified region. Moreover, increasing sequencing throughput increases richness overestimation, even more so for microbiota of high complexity. Finally, the choice of the reference database has a bigger impact on richness estimation for clustering-first pipelines, and on correct taxa identification for assignment-first pipelines. Using emerging assignment-first pipelines is a valid approach for targeted metagenomics analyses, with a quality of results comparable to popular clustering-first pipelines, even with an error-prone sequencing technology like Ion Torrent. However, those pipelines are highly sensitive to the quality of databases and their annotations, which makes clustering-first pipelines still the only reliable approach for studying microbiomes that are not well described.
Language:
English
Peer-reviewed:
Yes
Audience:
International
Popular science:
No
Source:
Files
- https://hal.archives-ouvertes.fr/hal-01575755/document
- Open access
- Access the document
- https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0169563&type=printable
- Open access
- Access the document