Exact mapping of prokaryotic gene starts.

Mikhail V. Baytaluk, Mikhail S. Gelfand, Andrey A. Mironov

Research output: Contribution to journal › Article › peer-review

6 Citations (Scopus)

Abstract

It is known that while the programs used to find genes in prokaryotic genomes reliably map protein-coding regions, they often fail to determine gene starts exactly. This problem is further aggravated by sequencing errors, most notably insertions and deletions leading to frame-shifts. Therefore, the exact mapping of gene starts and the identification of frame-shifts are important problems in the computer-assisted functional analysis of newly sequenced genomes. Here we review methods of gene recognition and describe a new algorithm for correction of gene starts and identification of frame-shifts in prokaryotic genomes. The algorithm is based on the comparison of nucleotide and protein sequences of homologous genes from related organisms, using the assumption that the rate of evolutionary change in protein-coding regions is lower than that in non-coding regions. A dynamic programming algorithm is used to align protein sequences obtained by formal translation of genomic nucleotide sequences. The possibility of frame-shifts is taken into account. The algorithm was tested on several groups of related organisms: gamma-proteobacteria, the Bacillus/Clostridium group, and three Pyrococcus genomes. The testing demonstrated that, depending on the genome, 1-10 per cent of genes have incorrect starts or contain frame-shifts. The algorithm is implemented in the program package Orthologator-GeneCorrector.
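As a rough illustration of the "formal translation" step mentioned in the abstract, the sketch below translates a nucleotide sequence in a chosen reading frame and lists the in-frame candidate start codons that precede each stop codon. It is not the Orthologator-GeneCorrector implementation: the function names `translate` and `candidate_starts` are hypothetical, the standard codon assignments are assumed, and treating ATG, GTG and TTG as candidate starts is an assumption commonly made for prokaryotes rather than something stated in the abstract. The alignment and frame-shift handling described above are not shown.

```python
# Minimal sketch (assumptions: standard codon table; ATG/GTG/TTG as candidate starts).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO)
)
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODONS = {"ATG", "GTG", "TTG"}  # assumed set of candidate starts


def translate(dna, frame=0):
    """Formally translate `dna` from offset `frame`; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )


def candidate_starts(dna, frame=0):
    """Return (start_positions, stop_position) for each stop codon in the frame
    that is preceded by at least one in-frame candidate start codon."""
    dna = dna.upper()
    results, starts = [], []
    for i in range(frame, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon in STOP_CODONS:
            if starts:
                results.append((starts, i))
            starts = []  # a stop closes the current open reading frame
        elif codon in START_CODONS:
            starts.append(i)
    return results
```

A comparison-based method such as the one described would then have to choose among the alternative starts reported for each open reading frame, for example by checking which choice is supported by homologous genes in related genomes.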

Original language: English
Pages (from-to): 181-194
Number of pages: 14
Journal: Briefings in Bioinformatics
Volume: 3
Issue number: 2
Publication status: Published - Jun 2002
Externally published: Yes
