Scaling up word sense disambiguation via parallel texts

Authors:
Yee Seng Chan;Hwee Tou Ng
Affiliations:
Department of Computer Science, National University of Singapore, Singapore;Department of Computer Science, National University of Singapore, Singapore
Venue:
AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 3
Year:
2005

Abstract

A critical problem faced by current supervised word sense disambiguation (WSD) systems is the lack of manually annotated training data. Tackling this data acquisition bottleneck is crucial for building high-accuracy, wide-coverage WSD systems. In this paper, we show that the approach of automatically gathering training examples from parallel texts is scalable to a large set of nouns. We evaluate on the nouns of the SENSEVAL-2 English all-words task, using fine-grained sense scoring. Our evaluation shows that training on examples gathered from 680MB of parallel texts achieves accuracy comparable to that of the best system in the SENSEVAL-2 English all-words task, and significantly outperforms the baseline of always choosing sense 1 of WordNet.
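
The abstract does not spell out how examples are gathered, so the following is only a minimal sketch of the general idea, under stated assumptions: that the parallel text is already word-aligned, and that each translation of a target noun has been mapped to one of its senses. The function names, the sense labels such as "bank#1", and the toy data are all hypothetical and are not taken from the paper. The second function illustrates the "always choose sense 1" baseline over an assumed, ordered sense inventory.

```python
from collections import defaultdict


def gather_examples(aligned_pairs, translation_to_sense):
    """Collect sense-labeled English contexts from word-aligned sentence pairs.

    aligned_pairs: iterable of (english_tokens, target_word, aligned_translation).
    translation_to_sense: dict mapping (target_word, translation) -> sense label.
    Both structures are assumptions made for this sketch.
    """
    examples = defaultdict(list)
    for tokens, word, translation in aligned_pairs:
        sense = translation_to_sense.get((word, translation))
        if sense is None:
            # The translation was not mapped to any sense, so the
            # occurrence cannot be labeled and is skipped.
            continue
        examples[word].append((tokens, sense))
    return dict(examples)


def first_sense_baseline(word, sense_inventory):
    """Return the first listed sense of a word (the 'sense 1' baseline)."""
    return sense_inventory[word][0]


# Hypothetical toy data: three occurrences of "bank" with aligned translations.
pairs = [
    (["the", "bank", "raised", "rates"], "bank", "银行"),
    (["the", "river", "bank", "flooded"], "bank", "河岸"),
    (["a", "bank", "of", "clouds"], "bank", "云层"),
]
mapping = {("bank", "银行"): "bank#1", ("bank", "河岸"): "bank#2"}
inventory = {"bank": ["bank#1", "bank#2"]}

examples = gather_examples(pairs, mapping)
baseline = first_sense_baseline("bank", inventory)
```

In this sketch the third occurrence is dropped because its translation has no sense mapping; the labeled contexts that remain could then be fed to any supervised classifier.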