Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond
Predicting Binding Sites in the Mouse Genome
ICMLA '07 Proceedings of the Sixth International Conference on Machine Learning and Applications
SMOTE: synthetic minority over-sampling technique
Journal of Artificial Intelligence Research
Effect of using varying negative examples in transcription factor binding site predictions
EvoBIO'11 Proceedings of the 9th European conference on Evolutionary computation, machine learning and data mining in bioinformatics
Improving transcription factor binding site predictions by using randomised negative examples
IPCAT'12 Proceedings of the 9th international conference on Information Processing in Cells and Tissues
Computational prediction of cis-regulatory binding sites is widely acknowledged to be a difficult task. Many different algorithms for finding binding sites are in current use, but most of them produce a high rate of false positive predictions. Moreover, many algorithmic approaches are inherently constrained in the range of binding sites that they can be expected to predict reliably. We propose using support vector machines (SVMs) to predict binding sites from multiple sources of evidence. To deal with the imbalanced nature of the data, we combine random selection under-sampling with the synthetic minority over-sampling technique (SMOTE). In addition, we remove some of the final predicted binding sites on the basis of their biological plausibility. The results show that we can generate a new prediction that significantly improves on the performance of any one of the individual prediction algorithms.
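The rebalancing step described above can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation: the function names (undersample, smote, rebalance), the parameters (n_majority_keep, n_synthetic, k, seed) and the choice of pure Python are all assumptions made here. Each row is assumed to be a numeric feature vector, for example the outputs of the individual prediction algorithms at one sequence position, with label 1 for a binding site (minority class) and 0 otherwise.

```python
import random


def undersample(majority, n_keep, rng):
    """Randomly keep n_keep majority-class rows, without replacement."""
    return rng.sample(majority, min(n_keep, len(majority)))


def smote(minority, n_new, k, rng):
    """Create n_new synthetic minority rows by interpolating between a
    minority row and one of its k nearest minority neighbours.
    Assumes at least two minority rows."""

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of the base row, excluding the row itself.
        others = [row for row in minority if row is not base]
        neighbours = sorted(others, key=lambda row: sq_dist(base, row))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to target
        synthetic.append([b + gap * (t - b) for b, t in zip(base, target)])
    return synthetic


def rebalance(majority, minority, n_majority_keep, n_synthetic, k=5, seed=0):
    """Combine random under-sampling of the majority class with SMOTE-style
    over-sampling of the minority class; returns rows X and labels y."""
    rng = random.Random(seed)
    kept = undersample(majority, n_majority_keep, rng)
    enlarged = minority + smote(minority, n_synthetic, k, rng)
    X = kept + enlarged
    y = [0] * len(kept) + [1] * len(enlarged)
    return X, y
```

The rebalanced rows and labels would then serve as training data for the SVM. How far to shrink the majority class and how many synthetic rows to add are tuning choices that this sketch leaves to the caller.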