Motif Discovery as a Multiple-Instance Problem

  • Authors: Ya Zhang; Yixin Chen; Xiang Ji

  • Affiliations: University of Kansas, USA; The University of Mississippi, USA; Yahoo! Inc.

  • Venue: ICTAI '06 Proceedings of the 18th IEEE International Conference on Tools with Artificial Intelligence

  • Year: 2006

Abstract

Motif discovery from biosequences, a challenging task both experimentally and computationally, has been studied intensively in recent years. In this paper, we formulate motif discovery as a multiple-instance problem and employ a multiple-instance learning method, MILES, to identify motifs in biological sequences. Each sequence is mapped into a feature space defined by the instances in the training sequences, using a novel instance-bag similarity measure. We then employ a 1-norm SVM to select important features and construct a classifier simultaneously. The high-ranked features correspond to the discovered motifs. We apply this method to discover transcription factor binding sites in promoters, a typical motif-finding problem in biology, and show that it performs at least comparably to existing methods.
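
The instance-based embedding described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not confirm: each sequence (bag) is split into overlapping k-mers (instances), and the instance-bag similarity is taken to be the best fraction of matching positions between a candidate k-mer and any k-mer in the bag. That similarity is only a stand-in for the paper's novel measure, which is not specified here. The names `kmers`, `instance_similarity`, `bag_similarity`, and `embed` are hypothetical. The 1-norm SVM step is not shown; it would be trained on the resulting feature matrix, and the candidate k-mers with nonzero weights would be read off as motifs.

```python
from typing import List, Tuple


def kmers(seq: str, k: int) -> List[str]:
    """Return all overlapping length-k substrings: the instances of one bag."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def instance_similarity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length k-mers.

    Assumed stand-in; not the similarity measure used in the paper.
    """
    return sum(x == y for x, y in zip(a, b)) / len(a)


def bag_similarity(instance: str, bag: List[str]) -> float:
    """Similarity of a candidate instance to a bag: its best match in the bag."""
    return max((instance_similarity(instance, other) for other in bag), default=0.0)


def embed(sequences: List[str], k: int) -> Tuple[List[str], List[List[float]]]:
    """Map each sequence to a vector with one feature per distinct training k-mer."""
    bags = [kmers(s, k) for s in sequences]
    # Every distinct instance seen in training is a candidate motif (one feature each).
    candidates = sorted({inst for bag in bags for inst in bag})
    features = [[bag_similarity(c, bag) for c in candidates] for bag in bags]
    return candidates, features


if __name__ == "__main__":
    candidates, features = embed(["ACGTGA", "TTGTGC", "AAAAAA"], k=4)
    for seq_features in features:
        print(dict(zip(candidates, seq_features)))
```

Because every training instance becomes a feature, the embedding grows with the total number of k-mers, which is why a sparsity-inducing classifier such as the 1-norm SVM is a natural fit for picking out a small set of motif candidates.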