A Coverage Criterion for Spaced Seeds and ...
Document type:
Journal article
DOI:
Title:
A Coverage Criterion for Spaced Seeds and Its Applications to Support Vector Machine String Kernels and $k$-Mer Distances
Author(s):
Noé, Laurent [Author]
Bioinformatics and Sequence Analysis [BONSAI]
Martin, Donald [Author]
North Carolina State University [Raleigh] [NC State]
Journal title:
Journal of Computational Biology
Pagination:
28
Publisher:
Mary Ann Liebert
Publication date:
2014-12-01
ISSN:
1066-5277
English keyword(s):
Spaced seed
Spaced $k$-mer
Gapped $k$-mer
Coverage sensitivity
Support Vector Machine
String kernel
Alignment-free distance
HAL discipline(s):
Computer Science [cs]/Bioinformatics [q-bio.QM]
Life Sciences [q-bio]/Bioinformatics, Systems Biology [q-bio.QM]
English abstract: [en]
Spaced seeds have recently been shown not only to detect more alignments, but also to give a more accurate measure of phylogenetic distances (Boden et al., 2013; Horwege et al., 2014; Leimeister et al., 2014) and to provide a lower misclassification rate when used with Support Vector Machines (SVMs) (Onodera and Shibuya, 2013). We confirm these two results by independent experiments, and propose in this article to use a coverage criterion (Benson and Mak, 2008; Martin, 2013; Martin and Noé, 2014) to measure seed efficiency in both cases in order to design better seed patterns. We first show how this coverage criterion can be directly measured by a full automaton-based approach. We then illustrate how this criterion performs when compared with two other frequently used criteria, namely the single-hit and multiple-hit criteria, through correlation coefficients with the correct classification and with the true distance. Finally, for alignment-free distances, we propose an extension that adopts the coverage criterion, show how it performs, and indicate how it can be efficiently computed.
Language:
English
Peer reviewed:
Yes
Audience:
International
Popular science:
No
Collections:
Source:
Files
- https://hal.inria.fr/hal-01083204/document
- Open access
- http://arxiv.org/pdf/1412.2587
- Open access
- coverage-sensitivity.pdf
- Open access