Practical Word-Sense Disambiguation Using Co-occurring Concept Codes

Authors:
Youjin Chung; Jong-Hyeok Lee
Affiliations:
Div. of Electrical and Computer Engineering, POSTECH and Advanced Information Technology Research Center (AITre), Pohang, Republic of Korea 790-784 (both authors)
Venue:
Machine Translation
Year:
2005

Abstract

Most previous corpus-based approaches to word-sense disambiguation have collected lexical information from the context of the word to be disambiguated. However, these approaches suffer from data sparseness. To address this problem, this paper proposes a disambiguation method that uses co-occurring concept codes (CCCs). The use of concept-code features together with concept-code generalization alleviates the data sparseness problem and reduces the number of features to a practical size without any loss in system performance. We demonstrate the effectiveness of the CCC features and of concept-code generalization through experimental evaluations. The proposed disambiguation method is applied to a Korean-to-Japanese MT system and evaluated with various machine-learning techniques. In a lexical sample evaluation, the CCC-based method achieved a precision of 82.00%, an 11.83% improvement over the baseline. In an experiment on real text, it achieved a precision of 83.51%, which indicates that the proposed method is useful for practical MT systems.
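
As a rough illustration of the idea summarized in the abstract, the sketch below replaces context words with concept codes and then generalizes each code to a higher level of a hierarchy, so that distinct context words can fall into one shared feature. Everything in it is an assumption made for illustration: the word-to-code table, the encoding of the hierarchy as digit prefixes, and the function names are hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical toy lexicon: word -> concept code.
# Assumption: a code is a digit string whose prefixes name its ancestors.
CONCEPT_CODES = {
    "river": "123",
    "water": "125",
    "money": "741",
    "loan": "745",
}


def generalize(code, level):
    """Return the ancestor of a concept code at the given depth (prefix truncation)."""
    return code[:level]


def ccc_features(context_words, level):
    """Count the generalized concept codes that co-occur with a target word."""
    features = Counter()
    for word in context_words:
        code = CONCEPT_CODES.get(word)
        if code is not None:  # words without a code are skipped in this sketch
            features[generalize(code, level)] += 1
    return features


context = ["river", "water"]
fine = ccc_features(context, level=3)    # one feature per distinct code
coarse = ccc_features(context, level=2)  # both words share the ancestor "12"
```

In this toy setting the two context words yield two separate features at the finest level but only one feature after generalization, which is the kind of reduction in feature count and sparseness that the abstract describes.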