Robust garden path parsing

  • Authors: Brian Roark
  • Affiliations: AT&T Labs - Research, 180 Park Avenue, Building 103, Room E145, Florham Park, NJ 07932-0971, USA; e-mail: roark@research.att.com
  • Venue: Natural Language Engineering
  • Year: 2004

Abstract

This paper presents modifications to a standard probabilistic context-free grammar that enable a predictive parser to avoid garden pathing without resorting to any ad hoc heuristic repair. The resulting parser is shown to apply efficiently to both newspaper text and telephone conversations with complete coverage and excellent accuracy. The distribution over trees is peaked enough to allow the parser to find parses efficiently, even with the much larger search space resulting from overgeneration. Empirical results are provided for both Wall Street Journal and Switchboard test corpora.