Using Real-Valued Meta Classifiers to Integrate and Contextualize Binding Site Predictions

  • Authors:
  • Mark Robinson;Offer Sharabi;Yi Sun;Rod Adams;Rene Boekhorst;Alistair G. Rust;Neil Davey

  • Affiliations:
  • University of Hertfordshire, College Lane, Hatfield, Hertfordshire AL10 9AB, Great Britain (all authors)

  • Venue:
  • ICANNGA '07 Proceedings of the 8th international conference on Adaptive and Natural Computing Algorithms, Part I
  • Year:
  • 2007

Abstract

Currently, the best algorithms for transcription factor binding site prediction are severely limited in accuracy. However, a non-linear combination of these algorithms could improve the quality of predictions. A support vector machine was applied to combine the predictions of 12 key real-valued algorithms. The data were divided into a training set and a test set, of which two versions were constructed: filtered and unfiltered. In addition, "windows" of consecutive results, of varying size, were used in the input vector in order to contextualize the neighbouring results. Finally, classification results were improved with the aid of under- and over-sampling techniques. Our major finding is that we can reduce the false-positive rate significantly. We also found that the bigger the window, the higher the F-score, but the more likely the classifier is to make a false-positive prediction, with the best trade-off being a window size of about 7.
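
The windowing idea and the two reported measures can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `window_features` and `f_score_and_fp_rate`, the zero-padding at sequence ends, the requirement of an odd window centred on the position, and the use of the standard F1 definition are all assumptions made for illustration. Each position's input vector is formed by concatenating the per-algorithm scores of the neighbouring positions, so 12 algorithms and a window of 7 would give 84 inputs per position.

```python
from typing import List, Sequence, Tuple


def window_features(scores: Sequence[Sequence[float]], window: int) -> List[List[float]]:
    """Build one input vector per sequence position.

    scores[i][j] is the real-valued score of algorithm j at position i.
    The vector for position i concatenates the scores of the `window`
    positions centred on i. Positions past either end of the sequence
    are zero-padded (an assumption, not taken from the paper).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    n_algorithms = len(scores[0]) if scores else 0
    pad = [0.0] * n_algorithms
    vectors = []
    for i in range(len(scores)):
        vec: List[float] = []
        for k in range(i - half, i + half + 1):
            vec.extend(scores[k] if 0 <= k < len(scores) else pad)
        vectors.append(vec)
    return vectors


def f_score_and_fp_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (F1 score, false-positive rate) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    fp_rate = fp / (fp + tn) if fp + tn else 0.0
    return f1, fp_rate
```

The resulting vectors could then be passed to any support vector machine implementation; a larger `window` gives the classifier more context per position at the cost of a longer input vector.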