ParaSense or how to use parallel corpora for word sense disambiguation

Authors:
Els Lefever;Véronique Hoste;Martine De Cock
Affiliations:
University College Ghent, Groot-Brittanniëlaan, Gent, Belgium and Ghent University, Krijgslaan, Gent, Belgium;University College Ghent, Groot-Brittanniëlaan, Gent, Belgium and Ghent University, Krijgslaan, Gent, Belgium and Ghent University, Blandijnberg, Gent, Belgium;Ghent University, Krijgslaan, Gent, Belgium
Venue:
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
Year:
2011

Abstract

This paper describes a set of exploratory experiments for a multilingual classification-based approach to Word Sense Disambiguation (WSD). Instead of using a predefined monolingual sense inventory such as WordNet, we use a language-independent framework in which the word senses are derived automatically from word alignments on a parallel corpus. We built five classifiers with English as the input language and translations in the five supported languages (viz. French, Dutch, Italian, Spanish and German) as classification output. The feature vectors incorporate both the more traditional local context features and binary bag-of-words features extracted from the aligned translations. Our results show that the ParaSense multilingual WSD system is very competitive with the best systems evaluated on the SemEval-2010 Cross-Lingual Word Sense Disambiguation task for all five target languages.
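
To make the feature design concrete, the following is a minimal sketch of how such a feature vector might be assembled and used for one target language (Dutch here). It is not the ParaSense implementation: the helper names, the toy sentences, the four-word translation vocabulary, the window size, and the simple overlap-based nearest-neighbour classifier are all assumptions chosen for illustration. It also assumes that translations of the query sentence into the other languages are available at classification time.

```python
from collections import Counter

def local_context_features(tokens, idx, window=2):
    """Words in a +/-window around the focus word; '_' pads sentence edges."""
    feats = {}
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        pos = idx + offset
        feats[f"w{offset:+d}"] = tokens[pos].lower() if 0 <= pos < len(tokens) else "_"
    return feats

def translation_bow_features(aligned_translations, vocabulary):
    """Binary bag-of-words: 1 if a vocabulary word occurs in any aligned translation."""
    present = {w.lower() for sent in aligned_translations.values() for w in sent.split()}
    return {f"bow={w}": int(w in present) for w in vocabulary}

def build_vector(tokens, idx, aligned_translations, vocabulary):
    """Combine local context features with translation bag-of-words features."""
    feats = local_context_features(tokens, idx)
    feats.update(translation_bow_features(aligned_translations, vocabulary))
    return feats

def overlap(a, b):
    """Number of features on which two vectors agree (illustrative similarity)."""
    return sum(1 for key in a if a[key] == b.get(key))

def classify(train, query, k=1):
    """Majority label among the k training vectors most similar to the query."""
    ranked = sorted(train, key=lambda ex: overlap(ex[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical vocabulary of content words seen in the aligned translations.
VOCAB = ["banque", "rive", "banco", "orilla"]

# Toy training instances for the English focus word "bank"; the label is the
# Dutch translation that word alignment would supply (invented for this sketch).
train = [
    (build_vector("he deposited money at the bank yesterday".split(), 5,
                  {"fr": "il a déposé de l'argent à la banque hier",
                   "es": "depositó dinero en el banco ayer"}, VOCAB), "bank"),
    (build_vector("they walked along the bank of the river".split(), 4,
                  {"fr": "ils ont marché le long de la rive du fleuve",
                   "es": "caminaron por la orilla del río"}, VOCAB), "oever"),
]

query = build_vector("she sat on the bank of the stream".split(), 4,
                     {"fr": "elle s'est assise sur la rive du ruisseau",
                      "es": "se sentó en la orilla del arroyo"}, VOCAB)

prediction = classify(train, query)
print(prediction)
```

In this toy setup the query agrees with the second training instance on three context positions and on all four bag-of-words features, so the predicted Dutch translation is "oever". A separate classifier of this kind would be trained per target language, each drawing its bag-of-words features from the translations in the remaining languages.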