Inducing compact but accurate tree-substitution grammars

  • Authors:
  • Trevor Cohn; Sharon Goldwater; Phil Blunsom

  • Affiliations:
  • University of Edinburgh, Edinburgh, Scotland, United Kingdom; University of Edinburgh, Edinburgh, Scotland, United Kingdom; University of Edinburgh, Edinburgh, Scotland, United Kingdom

  • Venue:
  • NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
  • Year:
  • 2009

Abstract

Tree substitution grammars (TSGs) are a compelling alternative to context-free grammars for modelling syntax. However, many popular techniques for estimating weighted TSGs (under the moniker of Data Oriented Parsing) suffer from the problems of inconsistency and over-fitting. We present a theoretically principled model which solves these problems using a Bayesian non-parametric formulation. Our model learns compact and simple grammars, uncovering latent linguistic structures (e.g., verb subcategorisation), and, in doing so, far outperforms a standard PCFG.