Efficient seeding techniques for protein similarity search

Mikhail Roytberg, Anna Gambin, Laurent Noé, Sławomir Lasota, Eugenia Furletova, Ewa Szczurek, Gregory Kucherov

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Citations (Scopus)


We apply the concept of subset seeds proposed in [1] to similarity search in protein sequences. The main question studied is the design of efficient seed alphabets to construct seeds with optimal sensitivity/selectivity trade-offs. We propose several different design methods and use them to construct several alphabets. We then analyze seeds built over those alphabets and compare them with the standard Blastp seeding method [2,3], as well as with the family of vector seeds proposed in [4]. While the formalism of subset seeds is less expressive (but less costly to implement) than the accumulative principle used in Blastp and vector seeds, our seeds show similar or even better performance than Blastp on Bernoulli models of proteins compatible with the common BLOSUM62 matrix.

Original language: English
Title of host publication: Bioinformatics Research and Development - Second International Conference, BIRD 2008, Proceedings
Publisher: Springer Verlag
Number of pages: 13
ISBN (Print): 9783540705987
Publication status: Published - 2008
Externally published: Yes
Event: 2nd International Conference on Bioinformatics Research and Development, BIRD 2008 - Vienna, Austria
Duration: 7 Jul 2008 to 9 Jul 2008

Publication series

Name: Communications in Computer and Information Science
ISSN (Print): 1865-0929


Conference: 2nd International Conference on Bioinformatics Research and Development, BIRD 2008


