Similarity boosting for label noise tolerance in protein-chemical interaction prediction
Proceedings of the 2nd ACM Conference on Bioinformatics, Computational Biology and Biomedicine
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Motivation: Predicting interactions between proteins and chemical compounds is of great benefit to drug discovery. Methods based on 3D structure, such as docking analysis, have been developed for this purpose. However, applying these methods genome-wide is not feasible because the availability of 3D structural information is limited.

Results: We describe a novel method for predicting protein–chemical interactions using a support vector machine (SVM). We use very general protein data, namely amino acid sequences, and combine them with chemical structures and mass spectrometry (MS) data. MS data may be of great use in finding new chemical compounds in the future. We assessed the validity of our method on a dataset of the binding of existing drugs and found that it achieved more than 80% accuracy. Furthermore, we conducted comprehensive target protein predictions for MDMA and validated the biological significance of our method by successfully finding proteins relevant to its known functions.

Availability: Available on request from the authors.

Contact: yasu@bio.keio.ac.jp

Supplementary information: Appendix (technical details of the method), Supplementary Tables 1–7 and Supplementary Figure 1.
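
The Results describe combining amino acid sequence data with chemical data into input for an SVM. The abstract does not say how either side is encoded, so the following is only a minimal sketch under stated assumptions: the protein is summarized by its amino acid composition (an assumed encoding, not necessarily the authors'), the chemical side is a hypothetical numeric descriptor (for example, substructure flags or binned MS peaks), and the two are concatenated into one vector per protein-chemical pair. The names `amino_acid_composition`, `pair_features`, and the toy inputs are illustrative only.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every vector is aligned.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the relative frequency of each standard amino acid.

    Assumption: composition is used here as a simple stand-in for
    whatever sequence-derived features the original method uses.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def pair_features(sequence, chemical_descriptor):
    """Concatenate protein and chemical features into one pair vector."""
    return amino_acid_composition(sequence) + [float(x) for x in chemical_descriptor]


# Hypothetical toy inputs, not taken from the paper's dataset.
toy_sequence = "MKTAYIAKQR"
toy_descriptor = [1, 0, 1, 1]  # e.g. presence/absence of four chemical features

vector = pair_features(toy_sequence, toy_descriptor)
print(len(vector))
```

Each labeled pair (interacting or not) would yield one such vector, and the resulting set could then be passed to any standard SVM implementation for training and prediction.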