Investigations into the role of lexical semantics in word sense disambiguation

Authors:
Hoa Trang Dang;Martha S. Palmer
Affiliations:
University of Pennsylvania;University of Pennsylvania
Venue:
Investigations into the role of lexical semantics in word sense disambiguation
Year:
2004

Abstract

Verbs that can have more than one meaning pose problems for Natural Language Processing (NLP) applications. While homonyms (words with unrelated meanings) are fairly tractable, polysemous verbs with similar, related meanings pose the greatest hurdle for automatic Word Sense Disambiguation (WSD). A major problem with WSD for verbs is that even humans disagree about what constitutes a different sense for a polysemous word. This thesis investigates verb lexical semantics and their computational representations, and how these can be used for automatic WSD. Our main contribution is in defining criteria by which humans make sense distinctions for verbs, and in translating these criteria into linguistically motivated features that we use to build a state-of-the-art automatic WSD system. Our explicit criteria for sense distinctions allow humans to sense-tag data more consistently. Improved human performance on the WSD task enables improved system performance.

We begin by examining the definition of verb polysemy implicit in Levin verb classes. We describe our work on VerbNet, a lexical resource in which different senses of a verb are defined by membership in different verb classes; the classes have distinctive syntactic frames and explicit semantic predicates that characterize the verb senses in that class. We then translate some of these lexical semantic characteristics into richer linguistic features used to build our automatic WSD system. The system performs competitively on the English verbs of Senseval-1 and Senseval-2 by combining information from syntax, lexical collocations, and semantic class constraints on verb arguments. Adding gold-standard predicate-argument information from PropBank further improves system performance.

Because humans have difficulty making fine-grained sense distinctions, creation of manually sense-tagged corpora is time-consuming and expensive. We experiment with active learning to get additional training data for our system, but find that the quality of manually sense-tagged data is limited by an inconsistent or unclear sense inventory. We develop criteria for grouping senses and show that well-defined groupings of WordNet senses can improve both human inter-annotator agreement and system performance. The groupings fit into a hierarchy of WordNet senses that allows different NLP applications to use different granularities of sense distinctions.