Multiword expressions in the wild? The mwetoolkit comes in handy

  • Authors: Carlos Ramisch; Aline Villavicencio; Christian Boitet

  • Affiliations: University of Grenoble and Federal University of Rio Grande do Sul; Federal University of Rio Grande do Sul; University of Grenoble

  • Venue: COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Demonstrations

  • Year: 2010

Abstract

The mwetoolkit is a tool for the automatic extraction of Multiword Expressions (MWEs) from monolingual corpora. It both generates and validates MWE candidates. Candidate generation is based on surface forms; for validation, the toolkit provides a series of noise-removal criteria, including several language-independent association measures. In this paper, we present the use of the mwetoolkit in a standard configuration to extract MWEs from a corpus of general-purpose English. We discuss the toolkit's functionalities through a set of selected examples and compare it with related work on MWE extraction.
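
The generate-then-validate idea described in the abstract can be illustrated with a minimal sketch. This is not the mwetoolkit's own code or API: the function name `extract_candidates`, the `min_count` threshold, the restriction to adjacent word pairs, and the choice of pointwise mutual information (PMI) as the association measure are all assumptions made for illustration.

```python
import math
from collections import Counter


def extract_candidates(tokens, min_count=2):
    """Hypothetical helper: generate bigram candidates from surface forms,
    then score the frequent ones with PMI (one possible association measure)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))  # adjacent surface-form pairs
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)

    scored = []
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # simple noise filter: drop rare candidates
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        pmi = math.log2(p_xy / (p_x * p_y))
        scored.append(((w1, w2), count, pmi))

    # Highest association score first
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored


if __name__ == "__main__":
    text = "hot dog and hot dog and the cat"
    for candidate, count, pmi in extract_candidates(text.split()):
        print(candidate, count, round(pmi, 2))
```

A realistic setup would work on a much larger corpus and would likely combine several measures with pattern-based filters; the sketch only shows how a frequency threshold and an association score can separate candidate generation from validation.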