Fast methods for kernel-based text analysis

Authors:
Taku Kudo; Yuji Matsumoto
Affiliations:
Nara Institute of Science and Technology; Nara Institute of Science and Technology
Venue:
ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Year:
2003

Abstract

Kernel-based learning (e.g., Support Vector Machines) has been successfully applied to many hard problems in Natural Language Processing (NLP). In NLP, feature combinations are crucial to improving performance, yet they are heuristically selected. Kernel methods change this situation: effective feature combinations are expanded implicitly, without loss of generality and without increasing the computational cost. Kernel-based text analysis achieves excellent accuracy; however, these methods are usually too slow to apply to large-scale text analysis. In this paper, we extend a Basket Mining algorithm to convert a kernel-based classifier into a simple and fast linear classifier. Experimental results on English BaseNP Chunking, Japanese Word Segmentation, and Japanese Dependency Parsing show that our new classifiers are about 30 to 300 times faster than the standard kernel-based classifiers.
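To make the idea of converting a kernel-based classifier into a linear one concrete, here is a minimal sketch. The abstract does not give the details, so everything specific here is an assumption for illustration: binary features represented as sets, a degree-2 polynomial kernel K(X, Z) = (1 + |X ∩ Z|)^2, and hypothetical function names. It relies on the identity (1 + s)^2 = 1 + 3s + 2·C(s, 2), which lets each support vector's contribution be pushed onto explicit weights for feature subsets of size 0, 1, and 2. The mining step that would prune low-weight subsets is omitted, so this is not the paper's algorithm, only the exact expansion it would approximate.

```python
from collections import defaultdict
from itertools import combinations

# Assumed coefficients for a degree-2 polynomial kernel on binary features:
# (1 + s)^2 = 1*C(s, 0) + 3*C(s, 1) + 2*C(s, 2), where s = |X ∩ Z|.
COEFF = {0: 1.0, 1: 3.0, 2: 2.0}


def kernel_score(support_vectors, alphas, bias, x):
    """Kernel classifier: sum_i alpha_i * (1 + |sv_i ∩ x|)^2 + bias.

    Each alpha is assumed to already include the label sign (alpha_i * y_i).
    """
    return sum(a * (1 + len(sv & x)) ** 2
               for sv, a in zip(support_vectors, alphas)) + bias


def expand_to_linear(support_vectors, alphas):
    """Fold every support vector into weights over feature subsets of size <= 2."""
    weights = defaultdict(float)
    for sv, a in zip(support_vectors, alphas):
        for size in (0, 1, 2):
            # Sort so the same subset always yields the same tuple key.
            for subset in combinations(sorted(sv), size):
                weights[subset] += COEFF[size] * a
    return dict(weights)


def linear_score(weights, bias, x):
    """Linear classifier: add up the weights of all subsets present in x."""
    total = bias
    for size in (0, 1, 2):
        for subset in combinations(sorted(x), size):
            total += weights.get(subset, 0.0)
    return total


if __name__ == "__main__":
    # Toy model with made-up support vectors and coefficients.
    svs = [{"a", "b", "c"}, {"b", "d"}]
    alphas = [0.5, -0.25]
    bias = 0.1
    x = {"a", "b", "d"}

    w = expand_to_linear(svs, alphas)
    # Both scores agree up to floating-point rounding.
    print(kernel_score(svs, alphas, bias, x), linear_score(w, bias, x))
```

In this sketch the linear scorer's cost depends only on the number of features in the test instance, not on the number of support vectors, which is the likely source of the speed-up the abstract reports. The catch is that the exact expansion can produce a very large weight table; selecting only the subsets whose weights matter is where a mining algorithm would come in.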