Similarity boosting for label noise tolerance in protein-chemical interaction prediction

Authors:
Aaron Smalter Hall;Jun Huan;Gerald Lushington
Affiliations:
University of Kansas, Lawrence, Kansas;University of Kansas, Lawrence, Kansas;University of Kansas, Lawrence, Kansas
Venue:
Proceedings of the 2nd ACM Conference on Bioinformatics, Computational Biology and Biomedicine
Year:
2011

Abstract

The analysis of protein-chemical reactions on a large scale is critical to understanding the complex, interrelated mechanisms that govern biological life at the cellular level. Chemical proteomics is a new research area aimed at proteome-wide screening of such protein-chemical interactions. To model the diverse and complex protein-chemical interaction space, recent work has turned to local models. Local models improve generalization by training a series of independent models, each localized to predict a single interaction. One limitation of this approach is that the localized models are not tolerant to noise in the interaction labels, and such noise is characteristic of much protein-chemical interaction data. This work proposes and evaluates a boosting framework that incorporates sample similarity to localize base models to appropriate regions of the interaction space, thereby ensuring that similar samples are given similar predictions and providing a measure of tolerance to noise in the training labels. The framework is described and compared with local models and several other competing classification methods. Protein-chemical interaction data sets are constructed from publicly available data, and a series of cross-validation experiments is performed to compare the noise tolerance, accuracy, sensitivity, and specificity of the various methods.