Conceptual information-based sense disambiguation

Authors:
You-Jin Chung;Kyonghi Moon;Jong-Hyeok Lee
Affiliations:
Div. of Electrical and Computer Engineering, POSTECH and AITrc, Pohang, R. of Korea;Div. of Computer and Information Engineering, Silla University, Busan, R. of Korea;Div. of Electrical and Computer Engineering, POSTECH and AITrc, Pohang, R. of Korea
Venue:
IJCNLP'04 Proceedings of the First international joint conference on Natural Language Processing
Year:
2004

Abstract

Most previous corpus-based approaches to word sense disambiguation (WSD) collect salient words from the context of a target word, but they suffer from data sparseness. To overcome this problem, this paper proposes a concept-based WSD method that uses an automatically generated sense-tagged corpus. Grammatical similarities between Korean and Japanese make it possible to construct a sense-tagged Korean corpus with an existing high-quality Japanese-to-Korean machine translation system. The sense-tagged corpus can serve as a knowledge source from which useful clues for WSD, such as concept co-occurrence information, are extracted. In an evaluation, a weighted voting model achieved the best average precision of 77.22%, an improvement of 14.47% over the baseline, which suggests that the proposed method is promising for practical MT systems.
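
The abstract names two ingredients, concept co-occurrence information and a weighted voting model, without giving their details. The following is a minimal sketch of how such pieces could fit together, not the authors' implementation. The function names, the relative-frequency scoring, the sense-prior voter, the toy concept codes, and the weights are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_cooccurrence(tagged_examples):
    """Count context concepts per sense.

    tagged_examples: iterable of (sense, [concept codes]) pairs, standing in
    for a sense-tagged corpus (hypothetical format).
    """
    counts = defaultdict(Counter)
    for sense, concepts in tagged_examples:
        counts[sense].update(concepts)
    return counts


def cooccurrence_scores(counts, context):
    """Score each sense by the relative frequency of the context concepts."""
    scores = {}
    for sense, concept_counts in counts.items():
        total = sum(concept_counts.values())
        hits = sum(concept_counts[c] for c in context)
        scores[sense] = hits / total if total else 0.0
    return scores


def prior_scores(tagged_examples):
    """Score each sense by how often it occurs in the tagged examples."""
    freq = Counter(sense for sense, _ in tagged_examples)
    total = sum(freq.values())
    return {sense: n / total for sense, n in freq.items()}


def weighted_vote(score_sets, weights):
    """Combine several score dictionaries with per-voter weights (assumed)."""
    combined = defaultdict(float)
    for scores, weight in zip(score_sets, weights):
        for sense, score in scores.items():
            combined[sense] += weight * score
    return max(combined, key=combined.get)


# Toy data: two senses of one ambiguous word, with made-up concept codes.
corpus = [
    ("finance", ["MONEY", "ORGANIZATION"]),
    ("finance", ["MONEY", "DOCUMENT"]),
    ("river", ["WATER", "LAND"]),
]
counts = train_cooccurrence(corpus)
voters = [cooccurrence_scores(counts, ["WATER"]), prior_scores(corpus)]
print(weighted_vote(voters, [0.7, 0.3]))
```

Working at the level of concept codes rather than surface words is what lets sparse evidence pool: different context words that map to the same concept contribute to the same count.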