Automated multiword expression prediction for grammar engineering

  • Authors: Yi Zhang; Valia Kordoni; Aline Villavicencio; Marco Idiart
  • Affiliations: Saarland University, Saarbrücken, Germany; Saarland University, Saarbrücken, Germany; Federal University of Rio Grande do Sul, Porto Alegre - RS, Brazil; Federal University of Rio Grande do Sul, Porto Alegre - RS, Brazil
  • Venue: MWE '06 Proceedings of the Workshop on Multiword Expressions: Identifying and Exploiting Underlying Properties
  • Year: 2006

Abstract

However large a hand-crafted wide-coverage grammar is, there will always be words and constructions that it does not cover and that cause parse failures. Because of their heterogeneous and flexible nature, Multiword Expressions (MWEs) are an endless source of such failures. As the number of MWEs in a speaker's lexicon is comparable to the number of single-word units (Jackendoff, 1997), a major challenge for robust natural language processing systems is dealing with them. In this paper we propose to semi-automatically detect MWE candidates in text using error mining techniques, and to validate them using the World Wide Web as a corpus in combination with statistical measures. For the candidates that pass validation, possible lexico-syntactic types are predicted, and they are then added to the grammar as new lexical entries. This approach significantly increases the grammar's coverage of these expressions.
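
To make the error mining step concrete, the sketch below scores word n-grams by how often the sentences containing them parse successfully, and keeps the low-scoring ones as MWE candidates. This is an illustration under stated assumptions, not the authors' implementation: the ratio-based score is one common formulation of error mining, and the function name `low_parsability_ngrams`, the parameters `threshold` and `min_count`, and the toy corpus are all hypothetical. The web-based validation and lexical type prediction steps are not shown.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def low_parsability_ngrams(sentences, n=2, threshold=0.5, min_count=2):
    """Rank n-grams that mostly occur in sentences the grammar failed to parse.

    sentences: iterable of (tokens, parsed_ok) pairs, where parsed_ok says
    whether the grammar produced a parse for that sentence.
    The score of an n-gram is (sentences containing it that parsed) divided
    by (all sentences containing it). This definition is an assumption.
    """
    total, parsed = Counter(), Counter()
    for tokens, parsed_ok in sentences:
        # Count each n-gram once per sentence.
        for gram in set(ngrams(tokens, n)):
            total[gram] += 1
            if parsed_ok:
                parsed[gram] += 1
    scored = [
        (parsed[gram] / total[gram], gram)
        for gram in total
        if total[gram] >= min_count  # ignore n-grams seen too rarely
    ]
    # Lowest score first: these are the most suspicious n-grams.
    return sorted(item for item in scored if item[0] < threshold)


# Hypothetical parser output: (tokens, did the sentence parse?)
toy_corpus = [
    ("he kicked the bucket".split(), False),
    ("she kicked the bucket yesterday".split(), False),
    ("he kicked the ball".split(), True),
    ("she saw the ball".split(), True),
]

candidates = low_parsability_ngrams(toy_corpus)
for score, gram in candidates:
    print(f"{score:.2f}", " ".join(gram))
```

In a fuller pipeline, each surviving n-gram would then be checked against web frequency counts and a statistical association measure before any lexical entry is proposed for it.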