Toward perfect sequences: correction ...
Document type :
Report and critical review of a work
Title :
Toward perfect sequences: correction of second-generation sequencing data via alignment on a de Bruijn graph
Author(s) :
Limasset, Antoine [Corresponding author]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Flot, Jean-François [Author]
Evolutionary Biology and Ecology [Brussels]
Peterlongo, Pierre [Author]
Scalable, Optimized and Parallel Algorithms for Genomics [GenScale]
Journal title :
Bioinformatics
Publisher :
Oxford University Press (OUP)
Publication date :
2019-02-20
ISSN :
1367-4803
HAL domain(s) :
Computer Science [cs]/Bioinformatics [q-bio.QM]
English abstract : [en]
Motivation: Short-read accuracy is important for downstream analyses such as genome assembly and hybrid long-read correction. Despite much work on short-read correction, present-day correctors either do not scale well on large data sets or consider reads as mere suites of k-mers, without taking into account their full-length read information.
Results: We propose a new method to correct short reads using de Bruijn graphs, and implement it as a tool called Bcool. As a first step, Bcool constructs a compacted de Bruijn graph from the reads. This graph is filtered on the basis of k-mer abundance, then of unitig abundance, thereby removing most sequencing errors. The cleaned graph is then used as a reference on which the reads are mapped to correct them. We show that this approach yields more accurate reads than k-mer-spectrum correctors while being scalable to human-size genomic datasets and beyond.
Availability and Implementation
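The k-mer abundance filtering step mentioned in the abstract can be illustrated with a toy sketch. This is not Bcool's implementation: the function names, the threshold, and the example reads are hypothetical, reverse complements are ignored, and the later steps (compaction into unitigs, unitig-abundance filtering, mapping reads onto the graph) are not shown.

```python
from collections import Counter

def count_kmers(reads, k):
    """Count every k-mer occurring in the reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def solid_kmers(counts, min_abundance):
    """Keep only k-mers seen at least min_abundance times (hypothetical threshold)."""
    return {kmer for kmer, n in counts.items() if n >= min_abundance}

def successors(kmer, kept):
    """Kept k-mers that overlap `kmer` by k-1 bases, i.e. its de Bruijn graph edges."""
    return [kmer[1:] + base for base in "ACGT" if kmer[1:] + base in kept]

# Toy input: three identical reads and one read carrying a single substitution.
reads = ["ACGTAC", "ACGTAC", "ACGTAC", "ACGTTC"]
counts = count_kmers(reads, k=4)
kept = solid_kmers(counts, min_abundance=2)
```

In this toy input, the k-mers that occur only in the erroneous read fall below the threshold and are dropped, so the remaining graph contains only the path supported by several reads.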
Language :
English
Popular science :
No
ANR Project :
Collections :
Source :
Files
- https://hal.inria.fr/hal-02407243/document
- Open access
- http://arxiv.org/pdf/1711.03336
- Open access