The nature of statistical learning theory
The nature of statistical learning theory
Foundations of statistical natural language processing
Foundations of statistical natural language processing
Fast algorithms for sorting and searching strings
SODA '97 Proceedings of the eighth annual ACM-SIAM symposium on Discrete algorithms
Kernels for Semi-Structured Data
ICML '02 Proceedings of the Nineteenth International Conference on Machine Learning
Text classification using string kernels
The Journal of Machine Learning Research
State of the art of graph-based data mining
ACM SIGKDD Explorations Newsletter
Kernel Methods for Pattern Analysis
Kernel Methods for Pattern Analysis
Fast and space efficient string kernels using suffix arrays
ICML '06 Proceedings of the 23rd international conference on Machine learning
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Neurocomputing
Kernel methods are a promising approach to learning with tree-structured data, and various efficient tree kernels have been proposed to capture informative structures in trees. In this paper, we propose a new tree kernel function based on "subpath sets" that captures vertical structures in rooted unordered trees, since such tree structures are often used to encode hierarchical information in data. We also propose a simple and efficient algorithm for computing the kernel by extending the multikey quicksort algorithm used for sorting strings. The algorithm runs in O((|T1|+|T2|) log(|T1|+|T2|)) time on average and uses O(|T1|+|T2|) space, where |T1| and |T2| are the numbers of nodes in the two trees T1 and T2. We apply the proposed kernel to two supervised classification tasks: XML classification in web mining and glycan classification in bioinformatics. The experimental results show that the predictive performance of the proposed kernel is competitive with that of the existing efficient tree kernel for unordered trees proposed by Vishwanathan et al. [1], and that the proposed kernel is also empirically faster to compute.
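To make the idea of a kernel over vertical structures concrete, the following is a minimal, naive sketch. It is not the multikey-quicksort algorithm described above, and it does not reach the stated average-case bound. It rests on assumptions the abstract does not spell out: a "subpath" is taken to be the label sequence of any downward path from a node to one of its descendants (including the node itself), and the kernel value is taken to be the inner product of the two trees' subpath count vectors, with no length-dependent weighting. The tree encoding `(label, [children])` and the names `subpath_counts` and `subpath_kernel` are hypothetical, chosen for illustration only.

```python
from collections import Counter


def subpath_counts(tree):
    """Count the label sequences of all downward paths in a tree.

    `tree` is a hypothetical encoding: a pair (label, list_of_child_trees).
    """
    counts = Counter()

    def visit(node, ancestors):
        label, children = node
        path = ancestors + (label,)
        # Every suffix of the root-to-node label path is a downward
        # path that ends at this node.
        for start in range(len(path)):
            counts[path[start:]] += 1
        for child in children:
            visit(child, path)

    visit(tree, ())
    return counts


def subpath_kernel(t1, t2):
    """Inner product of subpath count vectors (assumed, unweighted form)."""
    c1, c2 = subpath_counts(t1), subpath_counts(t2)
    return sum(n * c2[p] for p, n in c1.items())


# Small trees for illustration.
t1 = ("a", [("b", []), ("c", [])])
t2 = ("a", [("b", [])])
t3 = ("a", [("c", []), ("b", [])])  # t1 with its children reordered

print(subpath_kernel(t1, t2))
print(subpath_kernel(t1, t3) == subpath_kernel(t1, t1))
```

Because only vertical paths are counted, reordering siblings leaves the counts unchanged, which is why this style of kernel suits unordered trees. The sketch enumerates every subpath explicitly, so its cost grows with tree depth as well as node count.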