Segmentation of yeast DNA using hidden Markov models

Leonid Peshkin, Mikhail S. Gelfand

Research output: Contribution to journal › Article › peer-review

26 Citations (Scopus)

Abstract

Motivation: Compositionally homogeneous segments of genomic DNA often correspond to meaningful biological units. Simple sliding window analysis is usually insufficient for compositional segmentation of natural sequences. Hidden Markov models (HMMs) with a small number of states are a natural language for describing the compositional properties of chromosome-size DNA sequences.

Results: The algorithms were applied to yeast Saccharomyces cerevisiae chromosomes (YC) I, III, IV, VI and IX. The optimal number of HMM states is found to be four. The optimal four-state HMMs for all chromosomes are very similar, as are the reconstructed segmentations. In most cases the model with k + 1 states is obtained by 'splitting' one of the states in the model with k states, with a corresponding increase in the level of detail of the segmentation. The high-AT states usually correspond to intergenic regions. We also explore the model's likelihood landscape and analyze the dynamics of the optimization process, thus addressing the reliability of the obtained optima and the efficiency of the algorithms.

Availability: The system is available on request from the first author.
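As an illustration only, the general idea of segmenting a sequence by decoding an HMM whose states differ in base composition can be sketched as follows. This is a minimal sketch, not the authors' system: the two state names, all probabilities, and the helper names `viterbi_segment` and `to_segments` are hypothetical, the model is fixed by hand rather than estimated from data, and the input sequence is assumed to be non-empty.

```python
import math

def viterbi_segment(seq, states, log_start, log_trans, log_emit):
    """Return the most probable state path for a non-empty seq (Viterbi decoding)."""
    # score[s] = best log-probability of any path that ends in state s
    score = {s: log_start[s] + log_emit[s][seq[0]] for s in states}
    back = []  # back[t][s] = best predecessor of state s at position t + 1
    for ch in seq[1:]:
        prev, score, ptr = score, {}, {}
        for s in states:
            best = max(states, key=lambda r: prev[r] + log_trans[r][s])
            score[s] = prev[best] + log_trans[best][s] + log_emit[s][ch]
            ptr[s] = best
        back.append(ptr)
    # Trace the best path backwards from the best final state
    path = [max(states, key=lambda s: score[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path

def to_segments(path):
    """Collapse a state path into (start, end_exclusive, state) runs."""
    segments, start = [], 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            segments.append((start, i, path[start]))
            start = i
    return segments

# Hypothetical two-state model: every number below is an illustrative assumption.
states = ["AT_rich", "GC_rich"]
emit = {
    "AT_rich": {"A": 0.4, "T": 0.4, "C": 0.1, "G": 0.1},
    "GC_rich": {"A": 0.1, "T": 0.1, "C": 0.4, "G": 0.4},
}
trans = {
    "AT_rich": {"AT_rich": 0.9, "GC_rich": 0.1},
    "GC_rich": {"AT_rich": 0.1, "GC_rich": 0.9},
}
log_start = {s: math.log(0.5) for s in states}
log_trans = {r: {s: math.log(p) for s, p in row.items()} for r, row in trans.items()}
log_emit = {s: {b: math.log(p) for b, p in row.items()} for s, row in emit.items()}

seq = "ATATATATAT" + "GCGCGCGCGC"
segments = to_segments(viterbi_segment(seq, states, log_start, log_trans, log_emit))
print(segments)  # [(0, 10, 'AT_rich'), (10, 20, 'GC_rich')]
```

Unlike a sliding window, the decoded path places the segment boundary without a window-size parameter; the transition probabilities control how readily the model switches states. In practice the model parameters would be estimated from the sequence rather than set by hand as they are here.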

Original language: English
Pages (from-to): 980-986
Number of pages: 7
Journal: Bioinformatics
Volume: 15
Issue number: 12
Publication status: Published - Dec 1999
Externally published: Yes
