A comparison of word similarity measures for noun compound disambiguation

Authors:
Paul Nulty;Fintan Costello
Affiliations:
School of Computer Science and Informatics, University College Dublin, Dublin 4, Ireland;School of Computer Science and Informatics, University College Dublin, Dublin 4, Ireland
Venue:
AICS'09 Proceedings of the 20th Irish conference on Artificial intelligence and cognitive science
Year:
2009

Abstract

Noun compounds occur frequently in many languages, and the semantic disambiguation of these phrases has many potential applications in natural language processing and other areas. A common approach is to define a set of semantic relations that capture the interaction between the modifier and the head noun, and then to assign one of these relations to each compound. For example, the compound "flu virus" could be assigned the relation causal (the virus causes the flu); the relation for "desert wind" could be location (the wind is located in the desert). In this paper we investigate methods for learning the correct semantic relation for a given noun compound by comparing the new compound to a training set of hand-tagged instances, using the similarity of the words in each compound. The main contribution of this paper is a direct comparison of distributional and knowledge-based word similarity measures for this task, using various datasets and corpora. We find that the knowledge-based system performs much better when adequate training data is available.
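
The general idea of classifying a new compound by its similarity to hand-tagged instances can be illustrated with a minimal nearest-neighbour sketch. This is not the authors' implementation: the toy similarity table, the function names, and the choice to sum modifier and head similarities are all assumptions made for illustration. In practice the word_sim argument would be a distributional or knowledge-based (for example WordNet-based) measure.

```python
from typing import Callable, List, Optional, Tuple

Compound = Tuple[str, str]            # (modifier, head)
WordSim = Callable[[str, str], float]

# Hypothetical similarity scores, invented for this example only.
TOY_SIM = {
    frozenset(("flu", "cold")): 0.9,
    frozenset(("virus", "bacterium")): 0.8,
    frozenset(("desert", "mountain")): 0.7,
    frozenset(("wind", "breeze")): 0.9,
}


def toy_similarity(a: str, b: str) -> float:
    """Stand-in word similarity: 1.0 for identical words, table lookup otherwise."""
    if a == b:
        return 1.0
    return TOY_SIM.get(frozenset((a, b)), 0.0)


def compound_similarity(c1: Compound, c2: Compound, word_sim: WordSim) -> float:
    """Assumed combination: modifier similarity plus head similarity."""
    return word_sim(c1[0], c2[0]) + word_sim(c1[1], c2[1])


def classify(compound: Compound,
             training: List[Tuple[Compound, str]],
             word_sim: WordSim) -> Optional[str]:
    """Return the relation of the most similar tagged training compound."""
    best_relation, best_score = None, float("-inf")
    for example, relation in training:
        score = compound_similarity(compound, example, word_sim)
        if score > best_score:
            best_relation, best_score = relation, score
    return best_relation


# Hand-tagged instances taken from the examples in the abstract.
TRAINING = [
    (("flu", "virus"), "causal"),
    (("desert", "wind"), "location"),
]

if __name__ == "__main__":
    print(classify(("cold", "bacterium"), TRAINING, toy_similarity))
    print(classify(("mountain", "breeze"), TRAINING, toy_similarity))
```

Because the similarity measure is passed in as a function, the same classifier can be run with different measures, which is the kind of comparison the abstract describes.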