Integrating Domain and Paradigmatic Similarity for Unsupervised Sense Tagging

Authors:
Roberto Basili;Marco Cammisa;Alfio Massimiliano Gliozzo
Affiliations:
University of Rome Tor Vergata, Italy, email: {basili,cammisa}@info.uniroma2.it;University of Rome Tor Vergata, Italy, email: {basili,cammisa}@info.uniroma2.it;ITC-irst, Trento, Italy, email: gliozzo@itc.it
Venue:
Proceedings of the 2006 conference on ECAI 2006: 17th European Conference on Artificial Intelligence August 29 -- September 1, 2006, Riva del Garda, Italy
Year:
2006

Abstract

An unsupervised methodology for Word Sense Disambiguation, called Dynamic Domain Sense Tagging, is presented. It relies on the convergence of two well-known unsupervised approaches, Domain Driven Disambiguation and Conceptual Density. For each target word, a domain is dynamically modeled by expanding its topical context, i.e. a set of words evoking the underlying, implicit domain in which the word is located. The estimation of paradigmatic similarity within this specific lexicon serves as the disambiguation model. The Conceptual Density measure is used here to account for paradigmatic associations, and the top-scoring senses of the target word are selected accordingly. Results confirm the impact of domain-based representation in capturing useful paradigmatic generalizations, especially when only small text fragments are available. In addition, the precision/recall tradeoff of the resulting method can be tuned in a meaningful way, allowing us to achieve high precision scores in a purely unsupervised setting.
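
The pipeline outlined in the abstract (expand the topical context, score each candidate sense by how densely the expanded lexicon clusters around it in a sense hierarchy, then keep the top-scoring sense) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy taxonomy, the sense labels, the RELATED expansion table, and in particular the density function, which is a simplified coverage ratio and not the Conceptual Density formula used in the paper. The min_score parameter is likewise a hypothetical stand-in for whatever mechanism the authors use to trade recall for precision.

```python
# Toy data only: every word, sense label, and link below is hypothetical.
SENSES = {
    "bank": ["bank.finance", "bank.river"],
    "loan": ["loan.debt"],
    "interest": ["interest.fee", "interest.curiosity"],
    "deposit": ["deposit.money", "deposit.sediment"],
}

# Hypernym links (child -> parent) forming a small sense hierarchy.
PARENT = {
    "bank.finance": "finance", "loan.debt": "finance",
    "interest.fee": "finance", "deposit.money": "finance",
    "bank.river": "geology", "deposit.sediment": "geology",
    "interest.curiosity": "cognition",
    "finance": "entity", "geology": "entity", "cognition": "entity",
}

# Stand-in for domain expansion: words assumed to share a topic.
RELATED = {"loan": ["interest", "deposit"]}


def ancestors(node):
    """Yield the proper ancestors of a node, nearest first."""
    while node in PARENT:
        node = PARENT[node]
        yield node


def subtree(root):
    """Return the set of nodes at or below root."""
    nodes = set(PARENT) | set(PARENT.values())
    return {n for n in nodes if n == root or root in ancestors(n)}


def expand_context(context):
    """Add topically related words to the observed context words."""
    expanded = list(context)
    for word in context:
        for rel in RELATED.get(word, []):
            if rel not in expanded:
                expanded.append(rel)
    return expanded


def density(sense, context):
    """Simplified density: best ratio of covered context words to subtree size."""
    best = 0.0
    for anc in ancestors(sense):
        nodes = subtree(anc)
        covered = sum(
            1 for w in context if any(s in nodes for s in SENSES.get(w, []))
        )
        best = max(best, covered / len(nodes))
    return best


def tag(target, context, min_score=0.0):
    """Return the top-scoring sense, or None (abstain) below min_score."""
    lexicon = expand_context(context)
    scores = {s: density(s, lexicon) for s in SENSES[target]}
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_score else None
```

Calling tag("bank", ["loan"]) selects the financial sense in this toy setup, while raising min_score makes the tagger abstain on weakly supported decisions, which is one simple way to favor precision over recall.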