Prediction of Transcription Factor Families Using DNA Sequence Features

  • Authors:
  • Ashish Anand; Gary B. Fogel; Ganesan Pugalenthi; P. N. Suganthan

  • Affiliations:
  • School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798; Natural Selection, San Diego, CA 92121; School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798; School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798

  • Venue:
  • PRIB '08 Proceedings of the Third IAPR International Conference on Pattern Recognition in Bioinformatics
  • Year:
  • 2008

Abstract

Understanding the mechanisms of protein-DNA interaction is of critical importance in biology. Transcription factor (TF) binding to a specific DNA sequence depends on at least two factors: a protein-level DNA-binding domain and a nucleotide-level specific sequence serving as a TF binding site. TFs have been classified into families based on these factors. TFs within each family bind to specific nucleotide sequences in a very similar fashion. Identification of the TF family that might bind at a particular nucleotide sequence requires a machine learning approach. Here we considered two sets of features based on DNA sequences and their physicochemical properties and applied a one-versus-all SVM (OVA-SVM) with class-wise optimized features to identify TF family-specific features in DNA sequences. Using this approach, a mean prediction accuracy of ~80% was achieved, which represents an improvement of ~7% over previous approaches on the same data.
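
The one-versus-all scheme with class-wise features can be illustrated with a minimal sketch. The abstract does not specify the features, the feature-optimization procedure, or the SVM settings, so everything concrete here is an assumption made for illustration: dinucleotide (k-mer) frequencies stand in for the sequence features, a simple mean-difference filter stands in for the class-wise feature optimization, a small linear SVM trained by subgradient descent on the hinge loss stands in for the SVM, and the toy sequences and family names (`fam_A`, `fam_C`, `fam_G`) are invented. It uses only the Python standard library.

```python
from itertools import product
import random

def kmer_features(seq, k=2):
    """Normalized k-mer frequencies over ACGT (an assumed feature set)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with N or other symbols
            counts[window] += 1
    total = sum(counts.values()) or 1
    return [counts[m] / total for m in kmers]

def select_features(X, labels, label, m):
    """Hypothetical class-wise filter: keep the m features whose mean
    differs most between the target class and all other classes."""
    pos = [x for x, lab in zip(X, labels) if lab == label]
    neg = [x for x, lab in zip(X, labels) if lab != label]
    diff = [
        sum(x[j] for x in pos) / len(pos) - sum(x[j] for x in neg) / len(neg)
        for j in range(len(X[0]))
    ]
    return sorted(range(len(diff)), key=lambda j: -abs(diff[j]))[:m]

def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=200, seed=0):
    """Linear SVM via subgradient descent on the L2-regularized hinge loss.
    Labels y must be +1 or -1."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            w = [(1 - eta * lam) * wj for wj in w]  # regularization shrinkage
            if y[i] * score < 1:  # margin violated: hinge subgradient step
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b

def train_ova(seqs, labels, k=2, m=4):
    """One binary classifier per family, each on its own feature subset."""
    X = [kmer_features(s, k) for s in seqs]
    models = {}
    for label in sorted(set(labels)):
        cols = select_features(X, labels, label, m)
        Xc = [[x[j] for j in cols] for x in X]
        yc = [1 if lab == label else -1 for lab in labels]
        models[label] = (cols, *train_linear_svm(Xc, yc))
    return models

def predict(models, seq, k=2):
    """Assign the family whose classifier gives the highest decision score."""
    x = kmer_features(seq, k)
    scores = {
        label: sum(wj * x[j] for wj, j in zip(w, cols)) + b
        for label, (cols, w, b) in models.items()
    }
    return max(scores, key=scores.get)

# Invented toy data, not from the paper.
toy_seqs = ["AAAAAAAAAA", "AAAAAAAAAC", "CCCCCCCCCC",
            "CCCCCCCCCG", "GGGGGGGGGG", "GGGGGGGGGA"]
toy_labels = ["fam_A", "fam_A", "fam_C", "fam_C", "fam_G", "fam_G"]
models = train_ova(toy_seqs, toy_labels)
print(predict(models, "CCCCCCCCCC"))
```

Each family's classifier sees only the columns chosen for that family, which is the sense in which the features are class-wise. Note that decision scores from separately trained classifiers are not calibrated against one another, so taking the maximum score is a common convention rather than a guarantee of comparability.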