PCFG Learning by Nonterminal Partition Search

  • Authors: Anja Belz

  • Venue: ICGI '02 Proceedings of the 6th International Colloquium on Grammatical Inference: Algorithms and Applications
  • Year: 2002

Abstract

PCFG Learning by Partition Search is a general grammatical inference method for constructing, adapting and optimising PCFGs. Given a training corpus of examples from a language, a canonical grammar for the training corpus, and a parsing task, Partition Search PCFG Learning constructs a grammar that maximises performance on the parsing task and minimises grammar size. This paper describes Partition Search in detail, also providing theoretical background and a characterisation of the family of inference methods it belongs to. The paper also reports an example application to the task of building grammars for noun phrase extraction, a task that is crucial in many applications involving natural language processing. In the experiments, Partition Search improves parsing performance by up to 21.45% compared to a general baseline and by up to 3.48% compared to a task-specific baseline, while reducing grammar size by up to 17.25%.
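
The abstract does not spell out how a partition of the nonterminal set turns a canonical grammar into a smaller one, so the following is only a minimal sketch under stated assumptions, not the paper's implementation. It assumes the grammar is stored as rule counts, that merging a block of nonterminals means renaming every member to a shared label, and that rules which become identical have their counts summed before probabilities are re-estimated by relative frequency. The function names `merge_nonterminals` and `to_pcfg` and the toy grammar are hypothetical.

```python
from collections import defaultdict


def merge_nonterminals(rule_counts, partition):
    """Apply a partition of the nonterminals to a grammar given as rule counts.

    rule_counts: dict mapping (lhs, rhs_tuple) -> count
    partition:   iterable of sets of nonterminals; symbols in no block are kept as-is
    (Assumption: merging is plain renaming plus summing counts of identical rules.)
    """
    # Map every nonterminal in a block to one shared label for that block.
    block_of = {}
    for block in partition:
        label = "+".join(sorted(block))
        for nonterminal in block:
            block_of[nonterminal] = label

    merged = defaultdict(float)
    for (lhs, rhs), count in rule_counts.items():
        new_lhs = block_of.get(lhs, lhs)
        new_rhs = tuple(block_of.get(symbol, symbol) for symbol in rhs)
        merged[(new_lhs, new_rhs)] += count  # identical rules collapse here
    return dict(merged)


def to_pcfg(rule_counts):
    """Relative-frequency estimate: P(lhs -> rhs) = count(rule) / count(lhs)."""
    totals = defaultdict(float)
    for (lhs, _), count in rule_counts.items():
        totals[lhs] += count
    return {rule: count / totals[rule[0]] for rule, count in rule_counts.items()}


# Hypothetical toy grammar with two noun-phrase categories, NP and NPX.
counts = {
    ("S", ("NP", "VP")): 3,
    ("S", ("NPX", "VP")): 1,
    ("NP", ("d", "n")): 2,
    ("NPX", ("d", "n")): 1,
    ("NPX", ("n",)): 1,
    ("VP", ("v",)): 4,
}

merged_counts = merge_nonterminals(counts, [{"NP", "NPX"}])
merged_pcfg = to_pcfg(merged_counts)
print(len(counts), "rules before,", len(merged_counts), "rules after")
```

A search procedure in the spirit of the abstract would generate candidate partitions, apply each one as above, and keep those that score best on the parsing task while yielding fewer rules. How candidates are generated and scored is left open here because the abstract does not specify it.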