Document type:
Conference paper with proceedings
Title:
Efficient seeding techniques for protein similarity search
Author(s):
Roytberg, Mikhail [Author]
Institute of Mathematical Problems in Biology [IMPB RAS]
Gambin, Anna [Author]
Institute of Informatics [Warsaw]
Noé, Laurent [Corresponding author]
Sequential Learning [SEQUOIA]
Laboratoire d'Informatique Fondamentale de Lille [LIFL]
Lasota, Slawomir [Author]
Institute of Informatics [Warsaw]
Furletova, Eugenia [Author]
Institute of Mathematical Problems in Biology [IMPB RAS]
Szczurek, Ewa [Author]
Department Computational Molecular Biology [MPIMG Berlin]
Kucherov, Gregory [Author]
Sequential Learning [SEQUOIA]
Laboratoire d'Informatique Fondamentale de Lille [LIFL]
Scientific editor(s):
Elloumi, M.
Küng, J.
Linial, M.
Murphy, R.F.
Schneider, K.
Toma, C.
Conference title:
Proceedings of the 2nd International Conference BIRD
City:
Vienna
Country:
Austria
Conference start date:
2008-07
Journal title:
Communications in Computer and Information Science
Publisher:
Springer Berlin Heidelberg
Publication date:
2008
English keyword(s):
spaced seeds
subset seeds
protein similarity search
HAL discipline(s):
Computer Science [cs]/Bioinformatics [q-bio.QM]
Life Sciences [q-bio]/Bioinformatics, Systems Biology [q-bio.QM]
Abstract in English: [en]
We apply the concept of subset seeds proposed in [1] to similarity search in protein sequences. The main question studied is the design of efficient seed alphabets to construct seeds with optimal sensitivity/selectivity trade-offs. We propose several different design methods and use them to construct several alphabets. We then perform an analysis of seeds built over those alphabets and compare them with the standard Blastp seeding method [2,3], as well as with the family of vector seeds proposed in [4]. While the formalism of subset seeds is less expressive (but less costly to implement) than the accumulative principle used in Blastp and vector seeds, our seeds show a similar or even better performance than Blastp on Bernoulli models of proteins compatible with the common BLOSUM62 matrix.
Language:
English
Peer reviewed:
Yes
Audience:
International
Popular science:
No
Files
- https://hal.inria.fr/inria-00335564/document
- Open access
- http://arxiv.org/pdf/0810.5434
- Open access