Chart pruning for fast lexicalised-grammar parsing

  • Authors:
  • Yue Zhang;Byung-Gyu Ahn;Stephen Clark;Curt Van Wyk;James R. Curran;Laura Rimell

  • Affiliations:
  • Computer Laboratory, Cambridge;Computer Science, Johns Hopkins;Computer Laboratory, Cambridge;Computer Science, Northwestern College;School of IT, Sydney;Computer Laboratory, Cambridge

  • Venue:
  • COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
  • Year:
  • 2010

Abstract

Given the increasing need to process massive amounts of textual data, efficiency of NLP tools is becoming a pressing concern. Parsers based on lexicalised grammar formalisms, such as TAG and CCG, can be made more efficient using supertagging, which for CCG is so effective that every derivation consistent with the supertagger output can be stored in a packed chart. However, wide-coverage CCG parsers still produce a very large number of derivations for typical newspaper or Wikipedia sentences. In this paper we investigate two forms of chart pruning, and develop a novel method for pruning complete cells in a parse chart. The result is a wide-coverage CCG parser that can process almost 100 sentences per second, with little or no loss in accuracy over the baseline with no pruning.
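The abstract does not say which two forms of chart pruning are investigated, so the following is only a generic sketch of the idea, not the authors' method. It assumes a chart stored as a dictionary from spans to cells, where each cell maps a category label to a log score. It shows one way to prune within a cell (a score beam) and one way to prune complete cells (dropping spans listed in a hypothetical `closed_spans` set, which some external model would have to supply). The names `prune_cell`, `prune_chart`, `beam` and `closed_spans` are illustrative and do not come from the paper.

```python
def prune_cell(cell, beam):
    """Keep only the entries whose log score is within `beam` of the best entry.

    `cell` maps a category label to a log score (an assumed representation).
    """
    if not cell:
        return {}
    best = max(cell.values())
    return {cat: score for cat, score in cell.items() if score >= best - beam}


def prune_chart(chart, beam, closed_spans=frozenset()):
    """Apply both kinds of pruning to a chart.

    `chart` maps a (start, end) span to a cell.
    Spans in `closed_spans` (hypothetical input) are removed as whole cells;
    every remaining cell is pruned with the score beam.
    """
    pruned = {}
    for span, cell in chart.items():
        if span in closed_spans:
            continue  # complete-cell pruning: drop the whole cell
        pruned[span] = prune_cell(cell, beam)  # within-cell beam pruning
    return pruned


if __name__ == "__main__":
    # Toy chart for a two-word input; the scores are made up for illustration.
    toy_chart = {
        (0, 1): {"NP": -0.1, "N": -0.5, "S/S": -9.0},
        (1, 2): {"S\\NP": -0.2},
        (0, 2): {"S": -0.3, "NP": -0.4},
    }
    print(prune_chart(toy_chart, beam=2.0, closed_spans={(1, 2)}))
```

A wider `beam` keeps more entries per cell and so trades speed for coverage. How to choose it, and how to decide which cells to close, are exactly the questions a real parser has to answer empirically.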