Collocation extraction based on modifiability statistics

  • Authors:
  • Joachim Wermter;Udo Hahn

  • Affiliations:
  • Friedrich-Schiller-Universität Jena, Jena, Germany;Friedrich-Schiller-Universität Jena, Jena, Germany

  • Venue:
  • COLING '04 Proceedings of the 20th international conference on Computational Linguistics
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

We introduce a new, linguistically grounded measure of collocativity based on the property of limited modifiability and test it on German PP-verb combinations. We show that our measure not only significantly outperforms the standard lexical association measures typically employed for collocation extraction, but also yields a valuable by-product for the creation of collocation databases, viz. possible structural and lexical attributes. Our approach is language-, structure-, and domain-independent because it only requires some shallow syntactic analysis (e.g., a POS-tagger and a phrase chunker).