In this paper, we discuss tree kernels that can be applied to the classification of XML documents based on their DOM trees. DOM trees are ordered trees in which every node may be labeled by a vector of attributes, including its XML tag and its textual content. We describe four new kernels suitable for this kind of tree: a tree kernel derived from the well-known parse tree kernel; the set tree kernel, which allows permutations of children; the string tree kernel, an extension of the so-called partial tree kernel; and the soft tree kernel, which is based on the set tree kernel and takes into account a "fuzzy" comparison of child positions. We present first results on an artificial data set, a corpus of newspaper articles, for which we want to determine the type (genre) of an article based on its structure alone, and the well-known SUSANNE corpus.
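For orientation, the classical parse tree kernel that the first of these kernels is derived from can be sketched as follows. This is a minimal illustration of the standard formulation from the literature, not of any of the four new kernels. The nested-tuple tree encoding, the function names, and the decay parameter `lam` are assumptions made for the example, and each node carries a single label rather than a vector of attributes.

```python
from functools import lru_cache

# Assumed encoding: a tree is a nested tuple (label, child_1, ..., child_k);
# a leaf is a one-element tuple (label,).

def nodes(tree):
    """Yield every node (as the subtree rooted at it) in pre-order."""
    yield tree
    for child in tree[1:]:
        yield from nodes(child)

def production(node):
    """A node's label followed by the ordered labels of its children."""
    return (node[0],) + tuple(child[0] for child in node[1:])

def parse_tree_kernel(t1, t2, lam=1.0):
    """Sum, over all node pairs, the weighted count of shared tree fragments."""

    @lru_cache(maxsize=None)
    def common(n1, n2):
        # Leaves do not root a fragment on their own.
        if len(n1) == 1 or len(n2) == 1:
            return 0.0
        # Fragments can only match if label and ordered child labels agree.
        if production(n1) != production(n2):
            return 0.0
        result = lam
        for c1, c2 in zip(n1[1:], n2[1:]):
            result *= 1.0 + common(c1, c2)
        return result

    return sum(common(n1, n2) for n1 in nodes(t1) for n2 in nodes(t2))

dog = ("S", ("NP", ("D", ("the",)), ("N", ("dog",))), ("VP", ("V", ("barks",))))
cat = ("S", ("NP", ("D", ("the",)), ("N", ("cat",))), ("VP", ("V", ("barks",))))

print(parse_tree_kernel(dog, dog))  # 24.0
print(parse_tree_kernel(dog, cat))  # 15.0
```

Because productions are compared as ordered sequences of child labels, reordering the children of a node removes the match at that node. This is the restriction that a kernel allowing permutations of children would relax.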