Relative compositionality of multi-word expressions: a study of verb-noun (V-N) collocations

  • Authors:
  • Sriram Venkatapathy; Aravind K. Joshi

  • Affiliations:
  • Language Technologies Research Center, International Institute of Information Technology – Hyderabad, Hyderabad, India;Department of Computer and Information Science and Institute of Research in Cognitive Science, University of Pennsylvania, Philadelphia, PA

  • Venue:
  • IJCNLP'05 Proceedings of the Second International Joint Conference on Natural Language Processing
  • Year:
  • 2005

Abstract

Recognizing Multi-word Expressions (MWEs) and determining their relative compositionality are crucial to Natural Language Processing. Various statistical techniques have been proposed to recognize MWEs. In this paper, we integrate all the existing statistical features and investigate a range of classifiers for their suitability for recognizing non-compositional Verb-Noun (V-N) collocations. In the task of ranking V-N collocations by their relative compositionality, we show that the ranks computed by the classifier correlate significantly better with the human ranking than the rankings produced by individual features do. We also show that the properties 'Distributed frequency of object' (as defined in [27]) and 'Nearest Mutual Information' (as adapted from [18]) contribute greatly to recognizing non-compositional MWEs of the V-N type and to ranking V-N collocations by their relative compositionality.
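
The comparison between a system ranking and a human ranking can be illustrated with a rank correlation. The sketch below is only an illustration: the abstract does not name the correlation measure, so Spearman's rank correlation is assumed here, and the function names, scores, and collocation count are all hypothetical rather than taken from the paper.

```python
def ranks(scores):
    """Return 1-based ranks (1 = highest score); tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        # Extend the window over every item tied with the one at `pos`.
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = average_rank
        pos = end + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(a, b):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    return pearson(ranks(a), ranks(b))


# Hypothetical compositionality scores for five V-N collocations
# (invented numbers, for illustration only).
human_scores = [5, 4, 3, 2, 1]
classifier_scores = [0.9, 0.7, 0.6, 0.3, 0.2]
single_feature_scores = [0.2, 0.9, 0.5, 0.4, 0.8]

print("classifier vs. human:    ", spearman(classifier_scores, human_scores))
print("single feature vs. human:", spearman(single_feature_scores, human_scores))
```

A higher value for the classifier than for a single feature would correspond to the kind of result the abstract reports, although the actual measure and data used in the paper may differ.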