Fast linearization of tree kernels over large-scale data

Authors:
Aliaksei Severyn; Alessandro Moschitti
Affiliations:
University of Trento, DISI, Povo, TN, Italy; University of Trento, DISI, Povo, TN, Italy and Qatar Computing Research Institute, Doha, Qatar
Venue:
IJCAI '13: Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence
Year:
2013


Abstract

Convolution tree kernels have been successfully applied to many language processing tasks, achieving state-of-the-art accuracy. Unfortunately, learning with kernels has higher computational complexity than learning with explicit feature vectors, which makes them less attractive for large-scale data. In this paper, we study the latest approaches to this problem, ranging from feature hashing to reverse kernel engineering and approximate cutting plane training with model compression. We derive a novel method that combines reverse kernel engineering with an efficient kernel learning method. The approach retains the advantage of tree kernels, which automatically generate rich structured feature spaces, while working in a linear space where learning and testing are fast. We experimented with training sets of up to 4 million examples from Semantic Role Labeling. The results show that (i) choosing the correct structural features is essential and (ii) training can be sped up from weeks to less than 20 minutes.
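
To make the feature hashing idea mentioned in the abstract concrete, here is a minimal sketch; it is not the authors' method. It assumes trees are given as nested tuples, uses only complete subtrees as fragments (a much smaller feature space than the tree kernels typically used for Semantic Role Labeling), and hashes each fragment into a fixed number of buckets. The function names, the tree encoding, and the bucket count are all hypothetical choices made for illustration.

```python
import zlib
from collections import Counter


def subtree_strings(tree, out):
    """Collect the bracketed string of every complete subtree rooted at an
    internal node into `out`; return the string for `tree` itself."""
    if isinstance(tree, str):  # a leaf (word) is not a fragment on its own
        return tree
    label, *children = tree
    parts = " ".join(subtree_strings(child, out) for child in children)
    s = "(" + label + " " + parts + ")"
    out.append(s)
    return s


def hashed_vector(tree, num_buckets=2 ** 18):
    """Map a tree to a sparse count vector indexed by hashed fragments.
    Assumption: CRC32 modulo `num_buckets` stands in for any stable hash."""
    fragments = []
    subtree_strings(tree, fragments)
    vec = Counter()
    for frag in fragments:
        vec[zlib.crc32(frag.encode("utf-8")) % num_buckets] += 1
    return vec


def dot(u, v):
    """Sparse dot product; iterates over the smaller vector."""
    if len(u) > len(v):
        u, v = v, u
    return sum(val * v.get(idx, 0) for idx, val in u.items())


# Two toy parse trees that share the noun phrase "the dog".
t1 = ("S", ("NP", ("D", "the"), ("N", "dog")), ("VP", ("V", "barks")))
t2 = ("S", ("NP", ("D", "the"), ("N", "dog")), ("VP", ("V", "runs")))

# The linear dot product counts shared fragments (plus any hash collisions).
print(dot(hashed_vector(t1), hashed_vector(t2)))
```

Once every tree is mapped to such a vector, a linear learner can be trained on the vectors directly. The two trees above share three complete subtrees, so the dot product is at least 3; hash collisions can only increase it, which is the approximation error that feature hashing trades for a fixed-size feature space.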