Matrix representations, linear transformations, and kernels for disambiguation in natural language

Authors:
Tapio Pahikkala;Sampo Pyysalo;Jorma Boberg;Jouni Järvinen;Tapio Salakoski
Affiliations:
University of Turku and Turku Centre for Computer Science (TUCS), Turku, Finland 20014;University of Turku and Turku Centre for Computer Science (TUCS), Turku, Finland 20014;University of Turku and Turku Centre for Computer Science (TUCS), Turku, Finland 20014;University of Turku and Turku Centre for Computer Science (TUCS), Turku, Finland 20014;University of Turku and Turku Centre for Computer Science (TUCS), Turku, Finland 20014
Venue:
Machine Learning
Year:
2009

Abstract

When machine learning methods are applied to natural language inputs, the words and their positions in the input text are among the most important features. In this article, we introduce a framework based on a word-position matrix representation of text, linear feature transformations of the word-position matrices, and kernel functions constructed from the transformations. We consider two categories of transformations, one based on similarities between words and the other on similarities between positions, and the framework allows both to be applied simultaneously. We show how word and positional similarities obtained with previously proposed techniques, such as latent semantic analysis, can be incorporated as transformations in the framework. We also introduce novel ways to determine word and positional similarities. We further present efficient algorithms for computing kernel functions that incorporate the transformations on the word-position matrices and, more importantly, introduce a highly efficient method for prediction. The framework is particularly suited to natural language disambiguation tasks, where the aim is to select a particular property for a single word from a set of candidates based on the context of the word. We demonstrate the applicability of the framework to tasks of this type using context-sensitive spelling error correction on the Reuters News corpus as a model problem.
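
To make the word-position matrix idea concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not give the exact definitions, so the details here are assumptions: the binary matrix layout, the Frobenius inner product used as the kernel, the hypothetical helper names `word_position_matrix` and `transformed_kernel`, and the toy vocabulary and similarity values. `W` stands in for a word-similarity transformation and `P` for a positional one.

```python
import numpy as np

def word_position_matrix(context, vocab, span):
    """Build a binary matrix with one row per vocabulary word and one column
    per position in -span..span relative to the target word (0 is skipped).
    `context` maps a relative position to the word found there (assumed format)."""
    positions = [p for p in range(-span, span + 1) if p != 0]
    A = np.zeros((len(vocab), len(positions)))
    for col, p in enumerate(positions):
        word = context.get(p)
        if word in vocab:
            A[vocab[word], col] = 1.0
    return A

def transformed_kernel(A, B, W, P):
    """Frobenius inner product of the transformed matrices W A P and W B P
    (an assumed form of a kernel built from linear transformations)."""
    return float(np.sum((W @ A @ P) * (W @ B @ P)))

# Toy vocabulary and two contexts around an ambiguous target word,
# e.g. "a ___ of cake" and "a ___ of pie".
vocab = {"a": 0, "of": 1, "cake": 2, "pie": 3}
span = 2
A = word_position_matrix({-1: "a", 1: "of", 2: "cake"}, vocab, span)
B = word_position_matrix({-1: "a", 1: "of", 2: "pie"}, vocab, span)

I_words = np.eye(len(vocab))
I_pos = np.eye(2 * span)

# No transformation: only exact word matches at the same position count.
k_plain = transformed_kernel(A, B, I_words, I_pos)

# Word-similarity transformation: treat "cake" and "pie" as related
# (the value 0.5 is made up for illustration).
W = np.eye(len(vocab))
W[vocab["cake"], vocab["pie"]] = 0.5
W[vocab["pie"], vocab["cake"]] = 0.5
k_sim = transformed_kernel(A, B, W, I_pos)

print(k_plain, k_sim)
```

In this sketch the word-similarity transformation raises the kernel value for the two contexts because "cake" and "pie" now partially match at the same position. A positional transformation `P`, for example one that spreads weight to neighboring columns, could be passed in the same way to let nearby positions match each other.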