Large-scale analysis of protein-chemical reactions is critical to understanding the complex, interrelated mechanisms that govern biological life at the cellular level. Chemical proteomics is a new research area aimed at proteome-wide screening of such chemical-protein interactions. Recent work models the diverse and complex chemical-protein interaction space with local models, which improve generalization by training a series of independent models, each localized to predict a single interaction. One limitation of this approach is that the localized models do not tolerate noise in the interaction labels, and such noise is characteristic of much protein-chemical interaction data. This work proposes and evaluates a boosting framework that incorporates sample similarity to localize base models to appropriate regions of the interaction space, thereby ensuring that similar samples are given similar predictions and providing a measure of tolerance to noise in the training labels. The framework is described and compared to local models and several other competing classification methods. Chemical-protein interaction data sets are constructed from publicly available data, and a series of cross-validation experiments is performed to compare the noise tolerance, accuracy, sensitivity, and specificity of the various methods.
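The evaluation quantities named above (accuracy, sensitivity, specificity, and behavior under label noise in cross-validation) can be illustrated with a minimal sketch. This is not the authors' experimental code: the function names `flip_labels`, `evaluate`, and `k_fold_indices` are hypothetical, labels are assumed to be binary (1 = interaction, 0 = no interaction), and noise is assumed to be simulated by flipping each training label independently with a fixed probability.

```python
import random


def flip_labels(labels, noise_rate, seed=0):
    """Return a copy of binary labels where each one is flipped
    independently with probability noise_rate (assumed noise model)."""
    rng = random.Random(seed)
    return [1 - y if rng.random() < noise_rate else y for y in labels]


def evaluate(y_true, y_pred):
    """Compute accuracy, sensitivity, and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Sensitivity: fraction of true interactions that were predicted.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity: fraction of non-interactions correctly rejected.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


if __name__ == "__main__":
    truth = [1, 1, 0, 0]
    predicted = [1, 0, 0, 1]
    print(evaluate(truth, predicted))
    print(flip_labels(truth, noise_rate=0.25))
    print(k_fold_indices(10, k=5))
```

In a noise-tolerance experiment of this kind, one would typically apply `flip_labels` only to the training folds and score predictions against the clean labels of the held-out fold, so that any drop in the metrics reflects the classifier's sensitivity to noisy training labels rather than noise in the test set.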