A semantics-enhanced language model for unsupervised word sense disambiguation

Authors:
Shou-De Lin; Karin Verspoor
Affiliations:
National Taiwan University; Los Alamos National Laboratory
Venue:
CICLing '08: Proceedings of the 9th International Conference on Computational Linguistics and Intelligent Text Processing
Year:
2008

Abstract

An N-gram language model captures statistical word-order dependencies from corpora. Although language models have been applied extensively to a variety of NLP problems with reasonable success, the standard model does not incorporate semantic information, which limits its applicability to semantic problems such as word sense disambiguation. We propose a framework that integrates semantic information into the language model schema, allowing a system to exploit both syntactic and semantic information to address NLP problems. Furthermore, acknowledging the limited availability of semantically annotated data, we discuss how the proposed model can be learned without annotated training examples. Finally, we report on a case study showing how the semantics-enhanced language model can be applied to unsupervised word sense disambiguation with promising results.
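
For readers unfamiliar with the standard model the abstract starts from, the following is a minimal sketch of a word-level bigram (N = 2) language model with add-one smoothing. It illustrates only the baseline word-order statistics, not the semantics-enhanced framework proposed in the paper, whose details are not given here. The toy corpus, the function names `train_bigram` and `bigram_prob`, and the choice of add-one smoothing are all assumptions made for illustration.

```python
from collections import Counter

START = "<s>"  # hypothetical sentence-start marker used as left context


def train_bigram(sentences):
    """Count contexts and bigrams over tokenized sentences.

    Returns (context_counts, bigram_counts, vocab), where context_counts[w]
    is the number of times w is followed by another token.
    """
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for tokens in sentences:
        padded = [START] + tokens
        contexts.update(padded[:-1])             # every token that has a successor
        bigrams.update(zip(padded, padded[1:]))  # adjacent (prev, word) pairs
        vocab.update(tokens)
    return contexts, bigrams, vocab


def bigram_prob(prev, word, contexts, bigrams, vocab):
    """Add-one smoothed estimate of P(word | prev)."""
    return (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocab))


# Toy corpus (illustrative only): "bank" appears in two different senses,
# but a plain word-level model cannot tell them apart.
corpus = [
    ["the", "bank", "approved", "the", "loan"],
    ["the", "river", "bank", "flooded"],
]
contexts, bigrams, vocab = train_bigram(corpus)
p_bank_given_the = bigram_prob("the", "bank", contexts, bigrams, vocab)
```

In this sketch both occurrences of "bank" contribute to the same word counts, which is the limitation the abstract points to: the standard model records word order but carries no information about which sense was intended.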