We describe and empirically evaluate machine learning methods for predicting zinc binding sites from protein sequences. We start by observing that a data set whose examples are single residues is affected by autocorrelation, and we propose an ad hoc remedy in which sequentially close pairs of candidate residues are classified as being jointly involved in the coordination of a zinc ion. We develop a kernel for this type of data that can handle variable-length gaps between candidate coordinating residues. Our empirical evaluation on a data set of non-redundant protein chains shows that explicitly modeling the correlation between residues close in sequence significantly improves prediction performance.
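To make the pair-based formulation concrete, the following minimal sketch enumerates sequentially close pairs of candidate residues in a protein sequence, together with the gap between them. It illustrates only the example-construction step, not the kernel. The candidate set (cysteine and histidine), the gap limit `MAX_GAP`, and the function name `candidate_pairs` are assumptions made for illustration and are not taken from the abstract.

```python
from typing import FrozenSet, List, Tuple

# Assumption: cysteine (C) and histidine (H) are treated as candidate
# residues. The abstract does not list the candidate residue types.
CANDIDATES: FrozenSet[str] = frozenset("CH")

# Hypothetical limit on the number of residues between the two members
# of a pair. The abstract only says the pairs are "sequentially close".
MAX_GAP = 7


def candidate_pairs(
    seq: str,
    max_gap: int = MAX_GAP,
    candidates: FrozenSet[str] = CANDIDATES,
) -> List[Tuple[int, int, int]]:
    """Return (i, j, gap) for every pair of candidate residues at
    0-based positions i < j with at most max_gap residues in between."""
    positions = [i for i, aa in enumerate(seq) if aa in candidates]
    pairs = []
    for k, i in enumerate(positions):
        for j in positions[k + 1:]:
            gap = j - i - 1  # number of residues strictly between i and j
            if gap > max_gap:
                break  # positions are sorted, so later ones are farther away
            pairs.append((i, j, gap))
    return pairs


if __name__ == "__main__":
    # Toy sequence, not real data.
    for i, j, gap in candidate_pairs("ACAAHC", max_gap=2):
        print(i, j, gap)
```

Each returned pair would then be a single example for the classifier, and the gap value is the variable-length quantity that a kernel over such pairs has to accommodate.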