On some optimization heuristics for Lesk-like WSD algorithms

  • Authors:
  • Alexander Gelbukh; Grigori Sidorov; Sang-Yong Han

  • Affiliations:
  • Natural Language and Text Processing Laboratory, Center for Computing Research, National Polytechnic Institute, Mexico; Natural Language and Text Processing Laboratory, Center for Computing Research, National Polytechnic Institute, Mexico; Department of Computer Science and Engineering, Chung-Ang University, Seoul, Korea

  • Venue:
  • NLDB'05: Proceedings of the 10th International Conference on Natural Language Processing and Information Systems
  • Year:
  • 2005

Abstract

For most English words, dictionaries give various senses: e.g., “bank” can stand for a financial institution, a shore, a set, etc. Automatic selection of the sense intended in a given text is of crucial importance in many text-processing applications, such as information retrieval or machine translation: e.g., “(my account in the) bank” is to be translated into Spanish as “(mi cuenta en el) banco,” whereas “(on the) bank (of the lake)” as “(en la) orilla (del lago).” To choose the optimal combination of the intended senses of all words, Lesk suggested considering the global coherence of the text, by which we mean the average relatedness between the chosen senses of all words in the text. Because of the high dimensionality of the search space, heuristics must be used to find a near-optimal configuration. In this paper, we discuss several such heuristics that differ in complexity and in the quality of their results. In particular, we introduce a dimensionality reduction algorithm that reduces the complexity of computationally expensive approaches such as genetic algorithms.
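
To make the notion of global coherence concrete, the following is a minimal sketch, not the authors' implementation: the sense inventory, glosses, and function names are illustrative assumptions, sense relatedness is approximated by gloss-word overlap (a common Lesk-style proxy), and the optimal sense combination is found by exhaustive enumeration. The enumeration is exactly the exponential search that the heuristics discussed in the paper (e.g., genetic algorithms combined with dimensionality reduction) are meant to avoid on real texts.

from itertools import product

# Toy sense inventory (illustrative, not from the paper): word -> {sense: gloss}
SENSES = {
    "bank": {
        "bank/finance": "financial institution money deposit account",
        "bank/shore": "land alongside river lake water edge",
    },
    "account": {
        "account/finance": "record money bank deposit balance",
        "account/story": "narrative report description story",
    },
}

def relatedness(gloss_a, gloss_b):
    # Relatedness of two senses approximated by gloss-word overlap (Lesk-style proxy).
    return len(set(gloss_a.split()) & set(gloss_b.split()))

def coherence(words, senses):
    # Global coherence: average pairwise relatedness of the chosen senses.
    pairs = [(i, j) for i in range(len(words)) for j in range(i + 1, len(words))]
    if not pairs:
        return 0.0
    total = sum(
        relatedness(SENSES[words[i]][senses[i]], SENSES[words[j]][senses[j]])
        for i, j in pairs
    )
    return total / len(pairs)

def exhaustive_wsd(words):
    # Exact optimum by enumerating every sense combination; the number of
    # combinations grows exponentially with the number of ambiguous words,
    # which is why heuristic search is needed for real texts.
    best_combo, best_score = None, float("-inf")
    for combo in product(*(SENSES[w] for w in words)):
        score = coherence(words, combo)
        if score > best_score:
            best_combo, best_score = combo, score
    return dict(zip(words, best_combo)), best_score

if __name__ == "__main__":
    # "my account in the bank": the two financial senses should be chosen together.
    print(exhaustive_wsd(["bank", "account"]))

In this sketch, a heuristic method would replace exhaustive_wsd while keeping the same coherence objective, for example by searching only a reduced subspace of sense combinations or by evolving candidate combinations with a genetic algorithm.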