Machine learning study of DNA binding by transcription factors from the LacI family

Gennady G. Fedonin, Mikhail S. Gelfand

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

We studied 1372 LacI-family transcription factors and their 4484 DNA binding sites using machine learning algorithms and feature selection techniques. The Naive Bayes classifier and Logistic Regression were used to predict binding sites given transcription factor sequences. Prediction accuracy was estimated using 10-fold cross-validation. Experiments showed that the best prediction of nucleotide densities at selected site positions is obtained using only a few key protein sequence positions. These positions are stably selected by the forward feature selection based on the mutual information of factor-site position pairs.
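The forward selection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact selection criterion, so the code below assumes one plausible reading: at each step it adds the protein alignment position that most increases the empirical mutual information between the selected positions (taken jointly) and the nucleotide at one site position. The function names `mutual_information` and `forward_select`, the stopping rule, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two paired discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum over observed (x, y) pairs of p(x,y) * log2(p(x,y) / (p(x) * p(y))).
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def forward_select(protein_columns, site_column, k):
    """Greedily pick up to k protein positions (assumed criterion, see lead-in).

    protein_columns[i] holds the residue at alignment position i for every
    factor-site pair; site_column holds the nucleotide at one site position
    for the same pairs, in the same order.
    """
    selected = []
    best_mi = 0.0
    remaining = set(range(len(protein_columns)))

    def joint_mi(candidate):
        # Treat the residues at all chosen positions as one joint symbol.
        cols = [protein_columns[i] for i in selected + [candidate]]
        return mutual_information(list(zip(*cols)), site_column)

    while remaining and len(selected) < k:
        best = max(sorted(remaining), key=joint_mi)
        mi = joint_mi(best)
        if mi <= best_mi + 1e-12:
            break  # no remaining position adds information
        selected.append(best)
        best_mi = mi
        remaining.remove(best)
    return selected


# Hypothetical toy data: 8 factor-site pairs, 3 protein positions.
toy_protein_columns = [
    "RRRRKKKK",  # position 0: fully determines the site nucleotide
    "ALALALAL",  # position 1: independent of the site nucleotide
    "GGGGGGGG",  # position 2: conserved, carries no information
]
toy_site_column = "GGGGAAAA"

chosen = forward_select(toy_protein_columns, toy_site_column, k=2)
```

On this toy input only position 0 is selected, because adding either of the other positions does not raise the joint mutual information. With real alignments the joint estimate is biased upward for small samples, which is one reason to validate any selected positions with held-out data, as the 10-fold cross-validation in the abstract does for prediction accuracy.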

Original language: English
Title of host publication: Pattern Recognition in Bioinformatics - 5th IAPR International Conference, PRIB 2010, Proceedings
Pages: 15-26
Number of pages: 12
Publication status: Published - 2010
Externally published: Yes
Event: 5th IAPR International Conference on Pattern Recognition in Bioinformatics, PRIB 2010 - Nijmegen, Netherlands
Duration: 22 Sep 2010 to 24 Sep 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6282 LNBI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 5th IAPR International Conference on Pattern Recognition in Bioinformatics, PRIB 2010
Country/Territory: Netherlands
City: Nijmegen
Period: 22/09/10 to 24/09/10

Keywords

  • logistic regression
  • mutual information
  • naive Bayes classifier
  • transcription factors
