A generalization of Haussler's convolution kernel: mapping kernel

Authors:
Kilho Shin; Tetsuji Kuboyama
Affiliations:
Carnegie Mellon CyLab Japan, Kobe, Hyogo, Japan; Gakushuin University, Tokyo, Japan
Venue:
Proceedings of the 25th International Conference on Machine Learning
Year:
2008


Abstract

Haussler's convolution kernel provides a successful framework for engineering new positive semidefinite kernels and has been applied to a wide range of data types and applications. In this framework, each data object represents a finite set of finer-grained components. Haussler's convolution kernel takes a pair of data objects as input and returns the sum of the values of a predetermined primitive positive semidefinite kernel evaluated over all possible pairs of components of the input data objects. The mapping kernel that we introduce in this paper is a natural generalization of Haussler's convolution kernel, in that the input to the primitive kernel ranges over a predetermined subset of the cross product rather than the entire cross product. Although several instances of the mapping kernel appear in the literature, their positive semidefiniteness has been investigated case by case and, worse yet, has sometimes been incorrectly concluded. In fact, there exists a simple and easily checkable necessary and sufficient condition, which is generic in the sense that it enables us to investigate the positive semidefiniteness of an arbitrary instance of the mapping kernel. This is the first paper that presents this condition and proves its validity. In addition, we introduce two important instances of the mapping kernel, which we refer to as the size-of-index-structure-distribution kernel and the editcost-distribution kernel. Both are naturally derived from well-known (dis)similarity measures in the literature (e.g., the maximum agreement tree and the edit distance), and can reasonably be expected to improve on the performance of existing measures by evaluating their distributional features rather than their peak (maximum or minimum) features.
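
The difference between the two constructions described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's notation or implementation: the function names, the use of strings as data objects with characters as components, the delta primitive kernel, and the `same_position` pair selector are all assumptions chosen for illustration. The sketch does not check whether a given choice of pairs satisfies the paper's condition for positive semidefiniteness.

```python
from itertools import product


def delta(a, b):
    """Hypothetical primitive kernel: 1.0 if the components are equal, else 0.0."""
    return 1.0 if a == b else 0.0


def convolution_kernel(x, y, k):
    """Sum the primitive kernel k over ALL pairs of components of x and y."""
    return sum(k(a, b) for a, b in product(x, y))


def mapping_kernel(x, y, k, select_pairs):
    """Sum k only over the index pairs returned by select_pairs(x, y),
    i.e. over a predetermined subset of the cross product."""
    return sum(k(x[i], y[j]) for i, j in select_pairs(x, y))


def all_pairs(x, y):
    """Selector that returns the entire cross product of component indices."""
    return product(range(len(x)), range(len(y)))


def same_position(x, y):
    """Illustrative selector (an assumption): pair only components at equal positions."""
    return [(i, i) for i in range(min(len(x), len(y)))]


# Data objects are strings; their components are characters (illustrative only).
x, y = "aab", "aba"
full = convolution_kernel(x, y, delta)
restricted = mapping_kernel(x, y, delta, same_position)
unrestricted = mapping_kernel(x, y, delta, all_pairs)
print(full, restricted, unrestricted)
```

With the whole cross product selected, the mapping-kernel sketch reduces to the convolution-kernel sketch, which is the sense in which the former generalizes the latter. Restricting the pairs changes the value: here only the first position of the two strings matches.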