Gene recognition in eukaryotic DNA by comparison of genomic sequences

P. S. Novichkov, M. S. Gelfand, A. A. Mironov

Research output: Contribution to journal › Article › peer-review

29 Citations (Scopus)


Motivation: Sequencing of complete eukaryotic genomes and large syntenic fragments of genomes makes it possible to apply genomic comparison to gene recognition.

Results: This paper describes a spliced alignment algorithm that aligns candidate exon chains of two homologous genomic sequence fragments from different species. The algorithm is implemented in the Pro-Gen software. Unlike other algorithms, Pro-Gen does not assume conservation of the exon-intron structure. Amino acid sequences obtained by the formal translation of candidate exons are aligned instead of nucleotide sequences, which allows for distant comparisons. The algorithm was tested on a sample of human-mammal (mouse), human-vertebrate (Xenopus) and human-invertebrate (Drosophila) gene pairs. Surprisingly, the best results, 97-98% correlation between the actual and predicted genes, were obtained for the more distant comparisons, whereas the correlation on the human-mouse sample was only 93%. The latter value increases to 95% if conservation of the exon-intron structure is assumed. The lower accuracy on the human-mouse sample is caused by a large amount of sequence conservation in non-coding regions of the human and mouse genes, probably due to regulatory elements.
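The idea of comparing formally translated exons rather than nucleotides can be illustrated with a minimal sketch. This is not the Pro-Gen implementation: the function names `translate` and `align_score`, the identity-based scoring values, and the use of a plain global (Needleman-Wunsch) alignment of two single exons are all assumptions made for illustration. The abstract describes aligning whole chains of candidate exons, which this sketch does not attempt.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Formally translate a candidate exon in the given reading frame (0, 1 or 2).

    Codons containing unknown characters are translated as 'X';
    a trailing incomplete codon is ignored.
    """
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )


def align_score(a, b, match=2, mismatch=-1, gap=-2):
    """Global alignment score of two amino acid strings.

    The scoring values are illustrative assumptions, not those used by Pro-Gen.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


# Two hypothetical exon fragments that differ at two synonymous third-codon positions.
exon_a = "ATGGCCAAA"
exon_b = "ATGGCTAAG"
protein_score = align_score(translate(exon_a), translate(exon_b))
```

Here the two hypothetical fragments differ at two nucleotide positions but translate to the same peptide, so the protein-level score is the maximum possible for their length. This is the kind of divergence that a nucleotide-level comparison would penalize and a protein-level comparison would not.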

Original language: English
Pages (from-to): 1011-1018
Number of pages: 8
Issue number: 11
Publication status: Published - 2001
Externally published: Yes

