Combining resources for MWE-token classification

  • Authors:
  • Richard Fothergill;Timothy Baldwin

  • Affiliations:
  • The University of Melbourne VIC Australia;The University of Melbourne VIC Australia

  • Venue:
  • SemEval '12 Proceedings of the First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation
  • Year:
  • 2012

Abstract

We study the task of automatically disambiguating word combinations such as "jump the gun", which are ambiguous between a literal and an MWE (multiword expression) interpretation, focusing on the utility of type-level features from an MWE lexicon for the disambiguation task. To this end, we combine gold-standard idiomaticity of tokens in the OpenMWE corpus with MWE-type-level information drawn from the recently published JDMWE lexicon. We find that constituent modifiability in an MWE type is more predictive of the idiomaticity of its tokens than other constituent characteristics such as semantic class or part of speech.
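
As a rough illustration of what combining token-level labels with type-level lexicon information could look like, the sketch below pairs each labelled token instance with the features of its MWE type. This is not the authors' implementation: the function name `attach_type_features`, the field names, and every data value are hypothetical toy placeholders, and neither the OpenMWE nor the JDMWE file format is assumed.

```python
# Toy type-level entries keyed by MWE type. These values are made up for
# illustration and are NOT taken from the JDMWE lexicon.
TYPE_FEATURES = {
    "jump the gun": {
        "modifiable": False,        # hypothetical constituent modifiability flag
        "semantic_class": "event",  # hypothetical semantic class
        "head_pos": "verb",         # hypothetical part of speech
    },
}

# Toy token-level instances with idiomaticity labels. These are made up and
# are NOT taken from the OpenMWE corpus.
TOKENS = [
    {"mwe_type": "jump the gun", "text": "They jumped the gun on the launch.", "label": "idiomatic"},
    {"mwe_type": "jump the gun", "text": "The dog jumped the gun lying on the floor.", "label": "literal"},
    {"mwe_type": "spill the beans", "text": "He spilled the beans.", "label": "idiomatic"},
]


def attach_type_features(tokens, type_features):
    """Pair each token instance with the features of its MWE type.

    Returns a list of (feature_dict, label) tuples. Tokens whose MWE type
    has no lexicon entry are skipped.
    """
    rows = []
    for token in tokens:
        features = type_features.get(token["mwe_type"])
        if features is None:
            continue  # no type-level information available for this token
        rows.append((dict(features), token["label"]))
    return rows


if __name__ == "__main__":
    for features, label in attach_type_features(TOKENS, TYPE_FEATURES):
        print(label, features)
```

The resulting (features, label) pairs could then be passed to any standard classifier. Comparing feature subsets, for example modifiability alone against semantic class or part of speech, would be one way to probe the kind of question the abstract raises.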