On subset seeds for protein alignment

Mikhail Roytberg, Anna Gambin, Laurent Noé, Slawomir Lasota, Eugenia Furletova, Ewa Szczurek, Gregory Kucherov

Research output: Contribution to journal › Article › peer-review

17 Citations (Scopus)

Abstract

We apply the concept of subset seeds proposed in [1] to similarity search in protein sequences. The main question studied is the design of efficient seed alphabets to construct seeds with optimal sensitivity/selectivity trade-offs. We propose several different design methods and use them to construct several alphabets. We then perform a comparative analysis of seeds built over those alphabets and compare them with the standard BLASTP seeding method [2], [3], as well as with the family of vector seeds proposed in [4]. While the formalism of subset seeds is less expressive (but less costly to implement) than the cumulative principle used in BLASTP and vector seeds, our seeds show a similar or even better performance than BLASTP on Bernoulli models of proteins compatible with the common BLOSUM62 matrix. Finally, we perform a large-scale benchmarking of our seeds against several main databases of protein alignments. Here again, the results show a comparable or better performance of our seeds versus BLASTP.

Original language: English
Article number: 4752807
Pages (from-to): 483-494
Number of pages: 12
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 6
Issue number: 3
Status: Published - Jul 2009
Externally published: Yes
