A measure of syntactic flexibility for automatically identifying multiword expressions in corpora

  • Authors: Colin Bannard
  • Affiliations: Max Planck Institute for Evolutionary Anthropology, Deutscher Platz, Leipzig
  • Venue: MWE '07 Proceedings of the Workshop on a Broader Perspective on Multiword Expressions
  • Year: 2007

Abstract

Natural languages contain many multi-word sequences that do not display the variety of syntactic processes we would expect given their phrase type, and consequently must be included in the lexicon as multiword units. This paper describes a method for identifying such items in corpora, focussing on English verb-noun combinations. In an evaluation using a set of dictionary-published MWEs we show that our method achieves greater accuracy than existing MWE extraction methods based on lexical association.