Identifying cis-regulatory binding sites in DNA is a difficult problem in computational biology. A full understanding of the complex machinery embodied in genetic regulatory networks requires knowing both the identity of the regulatory transcription factors and the location of their binding sites in the genome. We show that using a support vector machine (SVM) together with data sampling to integrate the results of individual algorithms specialised for predicting binding site locations can produce significant improvements over the original algorithms. These improvements make the expensive experimental procedure of verifying the predictions more tractable.
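As an illustration only, the general idea of combining base predictors through an SVM trained on resampled data might be sketched as follows. Everything specific here is an assumption rather than the method of the paper: the three base predictor scores per position are invented toy values, binding positions are assumed to be the rare class, the sampling step is a simplified interpolation between random pairs of minority examples plus random undersampling of the majority, and the classifier is a small linear SVM trained by subgradient descent on the hinge loss rather than a kernel SVM from an established library.

```python
import random


def balance_classes(rows, labels, n_per_class, rng):
    """Return a class-balanced copy of (rows, labels).

    Assumes at least two positives and
    len(positives) <= n_per_class <= len(negatives).
    """
    pos = [r for r, y in zip(rows, labels) if y == 1]
    neg = [r for r, y in zip(rows, labels) if y == -1]
    # Undersample the majority class without replacement.
    neg_kept = rng.sample(neg, n_per_class)
    # Oversample the minority class with synthetic points placed on the
    # segment between two randomly chosen minority examples (a simplification:
    # no nearest-neighbour search is performed).
    pos_new = list(pos)
    while len(pos_new) < n_per_class:
        a, b = rng.sample(pos, 2)
        t = rng.random()
        pos_new.append(tuple(ai + t * (bi - ai) for ai, bi in zip(a, b)))
    return pos_new + neg_kept, [1] * len(pos_new) + [-1] * len(neg_kept)


def train_linear_svm(rows, labels, rng, lam=0.01, eta=0.1, epochs=200):
    """Linear soft-margin SVM via stochastic subgradient descent (hinge loss)."""
    w = [0.0] * len(rows[0])
    b = 0.0
    order = list(range(len(rows)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = rows[i], labels[i]
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # L2 regularisation shrinks the weights at every step.
            w = [wj * (1.0 - eta * lam) for wj in w]
            if margin < 1.0:
                # The example violates the margin: move towards it.
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
                b += eta * y
    return w, b


def predict(model, x):
    """Return +1 (predicted binding site) or -1 (predicted background)."""
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical toy data: each row holds the scores that three base
# prediction algorithms assign to one genomic position.
positives = [(0.9, 0.8, 0.7), (0.8, 0.9, 0.6), (0.7, 0.8, 0.9), (0.85, 0.7, 0.8)]
negatives = [
    (0.10 + 0.02 * (i % 5), 0.20 + 0.03 * (i % 4), 0.30 - 0.02 * (i % 3))
    for i in range(20)
]
rows = positives + negatives
labels = [1] * len(positives) + [-1] * len(negatives)

rng = random.Random(0)
balanced_rows, balanced_labels = balance_classes(rows, labels, 12, rng)
model = train_linear_svm(balanced_rows, balanced_labels, rng)

print(predict(model, (0.9, 0.9, 0.9)), predict(model, (0.1, 0.1, 0.1)))
```

In this sketch the SVM sees only the base predictors' scores, so it learns how much to trust each algorithm rather than learning from the sequence directly. The resampling is applied to the training data only; any evaluation set would normally keep its original class proportions.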