Approaches that can predict the biological activity or properties of a chemical compound are an important application of machine learning. In this paper, we introduce a new kernel function for measuring the similarity between chemical compounds and for learning their related properties and activities. The method is based on local atom pair environments, which can be computed rapidly by using the topological all-shortest-paths matrix and the geometrical distance matrix of a molecular graph as lookup tables. The local atom pair environments are stored in prefix search trees, so-called tries, for efficient comparison. The kernel can be computed either as an optimal assignment kernel or as a corresponding convolution kernel over all local atom similarities. We implemented the Tanimoto kernel, the min kernel, the minmax kernel, and the dot product kernel as local kernels, which are computed recursively by traversing the tries. We tested the approach on eight structure-activity and structure-property molecule benchmark data sets from the literature. The models were trained with ε-support vector regression and support vector classification. In a direct comparison, the local atom pair kernels proved at least competitive with state-of-the-art kernels in seven out of eight cases. A comparison against literature results, using experimental setups similar to those in the original works, confirmed these findings. The method is easy to implement and has robust default parameters.
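To make the four local kernels named above concrete, the following is a minimal sketch using their standard definitions on sparse count vectors. It is not the paper's implementation: the paper computes these values recursively while traversing tries, whereas this sketch assumes each local atom pair environment has already been flattened into a dictionary mapping a feature string to its count. The feature strings in the example (such as "C-1-O") and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter


def dot_kernel(a, b):
    # Dot product over the features present in both count vectors.
    return sum(a[k] * b[k] for k in a.keys() & b.keys())


def min_kernel(a, b):
    # Sum of the element-wise minima; only shared features contribute.
    return sum(min(a[k], b[k]) for k in a.keys() & b.keys())


def minmax_kernel(a, b):
    # Sum of minima divided by the sum of maxima over the union of features.
    keys = a.keys() | b.keys()
    denom = sum(max(a.get(k, 0), b.get(k, 0)) for k in keys)
    return min_kernel(a, b) / denom if denom else 0.0


def tanimoto_kernel(a, b):
    # <a, b> / (<a, a> + <b, b> - <a, b>)
    ab = dot_kernel(a, b)
    denom = dot_kernel(a, a) + dot_kernel(b, b) - ab
    return ab / denom if denom else 0.0


# Hypothetical feature counts for two local atom environments (assumption).
x = Counter({"C-1-O": 2, "C-2-N": 1})
y = Counter({"C-1-O": 1, "C-3-C": 3})

print(dot_kernel(x, y))       # 2
print(min_kernel(x, y))       # 1
print(minmax_kernel(x, y))    # 1/6
print(tanimoto_kernel(x, y))  # 2/13
```

The minmax and Tanimoto variants are normalized, so an environment compared with itself scores 1, while the dot product and min kernels grow with the feature counts. Returning 0.0 for two empty environments is a convention chosen for this sketch, not something the source specifies.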