Word sequence kernels

Authors:
Nicola Cancedda;Eric Gaussier;Cyril Goutte;Jean Michel Renders
Affiliations:
Xerox Research Centre Europe, 6, chemin de Maupertuis, 38240 Meylan, France;Xerox Research Centre Europe, 6, chemin de Maupertuis, 38240 Meylan, France;Xerox Research Centre Europe, 6, chemin de Maupertuis, 38240 Meylan, France;Xerox Research Centre Europe, 6, chemin de Maupertuis, 38240 Meylan, France
Venue:
The Journal of Machine Learning Research
Year:
2003

References: 13
Cited by: 66

A training algorithm for optimal margin classifiers

COLT '92 Proceedings of the fifth annual workshop on Computational learning theory
The nature of statistical learning theory

The nature of statistical learning theory
Support-Vector Networks

Machine Learning
Experiments in multilingual information retrieval using the SPIDER system

SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Cross-linguistic information retrieval workshop

SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Generalized vector spaces model in information retrieval

SIGIR '85 Proceedings of the 8th annual international ACM SIGIR conference on Research and development in information retrieval
Making large-scale support vector machine learning practical

Advances in kernel methods
A re-examination of text categorization methods

Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Natural language information retrieval: progress report

Information Processing and Management: an International Journal - The sixth text REtrieval conference (TREC-6)
AI Game Programming Wisdom

AI Game Programming Wisdom
Introduction to Modern Information Retrieval

Introduction to Modern Information Retrieval
Text Categorization with Support Vector Machines: Learning with Many Relevant Features

ECML '98 Proceedings of the 10th European Conference on Machine Learning
Text classification using string kernels

The Journal of Machine Learning Research

Two-stage statistical language models for text database selection

Information Retrieval
Practical solutions to the problem of diagonal dominance in kernel document clustering

ICML '06 Proceedings of the 23rd international conference on Machine learning
Extracting key-substring-group features for text classification

Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Convolution kernels with feature selection for natural language processing tasks

ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Dependency-based sentence alignment for multiple document summarization

COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Kernel-based approach for automatic evaluation of natural language generation technologies: application to automatic summarization

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Sequence-similarity kernels for SVMs to detect anomalies in system calls

Neurocomputing
Kernel-Based Learning of Hierarchical Multilabel Classification Models

The Journal of Machine Learning Research
Weighted kernel model for text categorization

AusDM '06 Proceedings of the fifth Australasian conference on Data mining and analytics - Volume 61
Supervised automatic evaluation for summarization with voted regression model

Information Processing and Management: an International Journal
Mining relational data from text: From strictly supervised to weakly supervised learning

Information Systems
Efficient computations of gapped string kernels based on suffix kernel

Neurocomputing
Support for seamless data exchanges between web services through information mapping analysis using kernel methods

Expert Systems with Applications: An International Journal
Kernel methods, syntax and semantics for relational text categorization

Proceedings of the 17th ACM conference on Information and knowledge management
Matrix representations, linear transformations, and kernels for disambiguation in natural language

Machine Learning
A Graph-Based Approach for Sentiment Sentence Extraction

New Frontiers in Applied Data Mining
Factored sequence kernels

Neurocomputing
Psychiatric document retrieval using a discourse-aware model

Artificial Intelligence
Designing Technology as an Embedded Resource for Troubleshooting

Computer Supported Cooperative Work
Discovering subword associations in strings in time linear in the output size

Journal of Discrete Algorithms
A Quantitative Method for RSS Based Applications

Proceedings of the 2008 conference on Applications of Data Mining in E-Business and Finance
Fourier Domain Scoring with Document Structure Consideration

Proceedings of the 2006 conference on Advances in Intelligent IT: Active Media Technology 2006
Locality kernels for sequential data and their applications to parse ranking

Applied Intelligence
On a Kernel Regression Approach to Machine Translation

IbPRIA '09 Proceedings of the 4th Iberian Conference on Pattern Recognition and Image Analysis
Efficient linearization of tree kernel functions

CoNLL '09 Proceedings of the Thirteenth Conference on Computational Natural Language Learning
Coreference systems based on kernels methods

COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
Using lexical and relational similarity to classify semantic relations

EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
Short text authorship attribution via sequence kernels, Markov chains and author unmasking: an investigation

EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Co-occurrence contexts for noun compound interpretation

MWE '07 Proceedings of the Workshop on a Broader Perspective on Multiword Expressions
A dependency-based word subsequence kernel

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Syntactic Structural Kernels for Natural Language Interfaces to Databases

ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part I
Human activity encoding and recognition using low-level visual features

IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
Kernel methods for minimally supervised wsd

Computational Linguistics
Reverse engineering of tree kernel feature spaces

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1
Re-ranking models based-on small training data for spoken language understanding

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 3
Convolution kernels on constituent, dependency and sequential structures for relation extraction

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 3
Classifier fusion for SVM-based multimedia semantic indexing

ECIR'07 Proceedings of the 29th European conference on IR research
A kernel method for measuring structural similarity between XML documents

IEA/AIE'07 Proceedings of the 20th international conference on Industrial, engineering, and other applications of applied intelligent systems
String extension learning

ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
On reverse feature engineering of syntactic tree kernels

CoNLL '10 Proceedings of the Fourteenth Conference on Computational Natural Language Learning
Two-tier similarity model for story link detection

CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
On languages piecewise testable in the strict sense

MOL'07/09 Proceedings of the 10th and 11th Biennial conference on The mathematics of language
Similarity word-sequence kernels for sentence clustering

SSPR&SPR'10 Proceedings of the 2010 joint IAPR international conference on Structural, syntactic, and statistical pattern recognition
Semi-supervised abstraction-augmented string kernel for multi-level bio-relation extraction

ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part II
Using local alignments for relation recognition

Journal of Artificial Intelligence Research
A sum-over-paths extension of edit distances accounting for all sequence alignments

Pattern Recognition
An effective approach for searching closest sentence translations from the web

DASFAA'11 Proceedings of the 16th international conference on Database systems for advanced applications: Part II
Improving graph-based random walks for complex question answering using syntactic, shallow semantic and extended string subsequence kernels

Information Processing and Management: an International Journal
Linguistic kernels for answer re-ranking in question answering systems

Information Processing and Management: an International Journal
Fast support vector machines for structural Kernels

ECML PKDD'11 Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part III
A tree kernel-based method for protein-protein interaction mining from biomedical literature

KDLL'06 Proceedings of the 2006 international conference on Knowledge Discovery in Life Science Literature
Instance pruning by filtering uninformative words: an information extraction case study

CICLing'05 Proceedings of the 6th international conference on Computational Linguistics and Intelligent Text Processing
Classification of RSS-Formatted documents using full text similarity measures

ICWE'05 Proceedings of the 5th international conference on Web Engineering
Query-focused multi-document summarization: Automatic data annotations and supervised learning approaches

Natural Language Engineering
Web textual documents scoring based on discrete transforms with fuzzy weighting

AWIC'05 Proceedings of the Third international conference on Advances in Web Intelligence
Using syntactic and semantic structural kernels for classifying definition questions in Jeopardy!

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Structured lexical similarity via convolution kernels on dependency trees

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
A fast bit-parallel algorithm for gapped string kernels

ICONIP'06 Proceedings of the 13th international conference on Neural Information Processing - Volume Part I
Combining classification with clustering for web person disambiguation

Proceedings of the 21st international conference companion on World Wide Web
Unsupervised modeling of dialog acts in asynchronous conversations

IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Three
Structural relationships for large-scale learning of answer re-ranking

SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Verb classification using distributional similarity in syntactic and semantic structures

ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Modeling topic dependencies in hierarchical text categorization

ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
A weakly supervised model for sentence-level semantic orientation analysis with multiple experts

EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Wikipedia-based WSD for multilingual frame annotation

Artificial Intelligence
Evaluating the impact of syntax and semantics on emotion recognition from text

CICLing'13 Proceedings of the 14th international conference on Computational Linguistics and Intelligent Text Processing - Volume 2

Abstract

We address the problem of categorising documents using kernel-based methods such as Support Vector Machines (SVMs). Since the work of Joachims (1998), there has been ample experimental evidence that SVMs using standard word frequencies as features yield state-of-the-art performance on a number of benchmark problems. Recently, Lodhi et al. (2002) proposed the use of string kernels, a novel way of computing document similarity based on matching non-consecutive subsequences of characters. In this article, we propose applying this technique to sequences of words rather than characters. This approach has several advantages: in particular, it is more efficient computationally and it ties in closely with standard linguistic pre-processing techniques. We present some extensions to sequence kernels dealing with symbol-dependent and match-dependent decay factors, and report empirical evaluations of these extensions on the Reuters-21578 dataset.
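
To make the idea of matching non-consecutive subsequences of words concrete, here is a minimal Python sketch of the standard gap-weighted subsequence kernel recursion (in the style of Lodhi et al., 2002), applied to word tokens instead of characters. It is an illustration under stated assumptions, not the authors' implementation: it uses a single decay factor `lam` for both matches and gaps, so it does not cover the symbol-dependent or match-dependent extensions, and the function names and whitespace tokenisation are hypothetical choices.

```python
import math


def word_sequence_kernel(s, t, n, lam):
    """Gap-weighted subsequence kernel of order n over token lists s and t.

    Each common (possibly non-contiguous) subsequence of n tokens contributes
    lam ** (span in s + span in t), so gappy matches are penalised.
    Illustrative sketch only: one decay factor lam, assumed 0 < lam <= 1.
    """
    ls, lt = len(s), len(t)
    # kp[i][a][b] holds the auxiliary value K'_i(s[:a], t[:b]).
    kp = [[[0.0] * (lt + 1) for _ in range(ls + 1)] for _ in range(n)]
    for a in range(ls + 1):
        for b in range(lt + 1):
            kp[0][a][b] = 1.0  # K'_0 is 1 for every pair of prefixes
    for i in range(1, n):
        for a in range(1, ls + 1):
            kpp = 0.0  # running K''_i(s[:a], t[:b]) as b grows
            for b in range(1, lt + 1):
                match = lam * lam * kp[i - 1][a - 1][b - 1] if s[a - 1] == t[b - 1] else 0.0
                kpp = lam * kpp + match
                kp[i][a][b] = lam * kp[i][a - 1][b] + kpp
    # Close every subsequence on a final matching pair of tokens.
    k = 0.0
    for a in range(1, ls + 1):
        for b in range(1, lt + 1):
            if s[a - 1] == t[b - 1]:
                k += lam * lam * kp[n - 1][a - 1][b - 1]
    return k


def normalised_kernel(s, t, n, lam):
    """Cosine-style normalisation so that a sequence has similarity 1 with itself."""
    denom = math.sqrt(word_sequence_kernel(s, s, n, lam) * word_sequence_kernel(t, t, n, lam))
    return word_sequence_kernel(s, t, n, lam) / denom if denom > 0 else 0.0


if __name__ == "__main__":
    doc1 = "the cat sat on the mat".split()
    doc2 = "the cat lay on the mat".split()
    print(normalised_kernel(doc1, doc2, n=2, lam=0.5))
```

Working on words rather than characters shortens the sequences fed to the dynamic program, whose cost grows with n times the product of the two sequence lengths, which is one way to read the efficiency advantage mentioned in the abstract.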