Robust German noun chunking with a probabilistic context-free grammar

  • Authors: Helmut Schmid; Sabine Schulte im Walde

  • Affiliations: Universität Stuttgart, Stuttgart, Germany (both authors)

  • Venue: COLING '00: Proceedings of the 18th Conference on Computational Linguistics, Volume 2
  • Year: 2000

Abstract

We present a noun chunker for German which is based on a head-lexicalised probabilistic context-free grammar. A manually developed grammar was semi-automatically extended with robustness rules in order to allow parsing of unrestricted text. The model parameters were learned from unlabelled training data by a probabilistic context-free parser. For extracting noun chunks, the parser generates all possible noun chunk analyses, scores them with a novel algorithm which maximizes the best chunk sequence criterion, and chooses the most probable chunk sequence. An evaluation of the chunker on 2,140 hand-annotated noun chunks yielded 92% recall and 93% precision.
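
To make the reported recall and precision figures concrete, here is a minimal sketch of how such scores are commonly computed for chunking. It assumes exact span matching, which the abstract does not confirm as the evaluation criterion. The function name `precision_recall` and the toy spans are hypothetical and are not taken from the paper.

```python
def precision_recall(predicted, gold):
    """Exact-match precision and recall over (start, end) chunk spans.

    Assumption: a predicted chunk counts as correct only if both of its
    boundaries match a gold chunk exactly.
    """
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)  # chunks present in both sets
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy spans (token offsets), not data from the paper.
gold = [(0, 2), (3, 5), (6, 9), (10, 12)]
predicted = [(0, 2), (3, 5), (6, 8), (10, 12), (13, 14)]

p, r = precision_recall(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case three of the five predicted chunks match gold chunks exactly, and three of the four gold chunks are found. Precision penalises spurious or misaligned chunks, while recall penalises missed ones.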