Predicting protein-protein interactions with k-nearest neighbors classification algorithm

  • Authors:
  • Mario R. Guarracino;Adriano Nebbia

  • Affiliations:
  • High Performance Computing and Networking Institute, National Research Council, Napoli, Italy;High Performance Computing and Networking Institute, National Research Council, Napoli, Italy

  • Venue:
  • CIBB'09 Proceedings of the 6th international conference on Computational intelligence methods for bioinformatics and biostatistics
  • Year:
  • 2009

Abstract

In this work we address the problem of predicting protein-protein interactions. Solving it can give greater insight into complex diseases such as cancer, and provides valuable information for the study of active small molecules for new drugs by limiting the number of molecules that must be tested in the laboratory. We model the problem as a binary classification task, using a suitable coding of the amino acid sequences. We apply the k-nearest neighbors classification algorithm to the classes of interacting and noninteracting proteins. Results show that high prediction accuracy can be achieved in cross-validation. A case study shows that a real network of thousands of interacting proteins can be reconstructed with high accuracy on standard hardware.
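
The abstract does not say which sequence coding, distance metric, or cross-validation scheme the authors used, so the following is only a minimal sketch of the general idea under stated assumptions: each protein is coded by its amino acid composition (one possible coding, assumed here), a protein pair is the concatenation of the two codings, distance is Euclidean, and accuracy is estimated by leave-one-out cross-validation. The function names (`composition`, `encode_pair`, `knn_predict`, `leave_one_out_accuracy`) and the toy sequences are invented for illustration and are not the authors' implementation or data.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Assumed coding: fraction of each standard amino acid in the sequence."""
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def encode_pair(seq_a, seq_b):
    """Code a protein pair as the concatenation of both composition vectors."""
    return composition(seq_a) + composition(seq_b)


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points closest to x (Euclidean)."""
    ranked = sorted((math.dist(t, x), label) for t, label in zip(train_X, train_y))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(X, y, k=3):
    """Hold out each sample in turn and classify it from the remaining ones."""
    hits = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, X[i], k) == y[i]
    return hits / len(X)


# Toy pairs made up for illustration only: label 1 = interacting, 0 = noninteracting.
toy_pairs = [
    ("KKEEKKEA", "EEKKEKEK", 1),
    ("KEKEKKEE", "EKEKEEKG", 1),
    ("GGAAGGAK", "AAGGAGAG", 0),
    ("GAGAGGAA", "AGAGAAGE", 0),
]
X = [encode_pair(a, b) for a, b, _ in toy_pairs]
y = [label for _, _, label in toy_pairs]
print("leave-one-out accuracy:", leave_one_out_accuracy(X, y, k=1))
```

A brute-force search like this compares every query against every training pair, which is fine for a sketch; reconstructing a network of thousands of proteins would call for a more careful implementation than the one shown here.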