Chart pruning for fast lexicalised-grammar parsing

Authors:
Yue Zhang;Byung-Gyu Ahn;Stephen Clark;Curt Van Wyk;James R. Curran;Laura Rimell
Affiliations:
Computer Laboratory, Cambridge;Computer Science, Johns Hopkins;Computer Laboratory, Cambridge;Computer Science, Northwestern College;School of IT, Sydney;Computer Laboratory, Cambridge
Venue:
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
Year:
2010

Abstract

Given the increasing need to process massive amounts of textual data, efficiency of NLP tools is becoming a pressing concern. Parsers based on lexicalised grammar formalisms, such as TAG and CCG, can be made more efficient using supertagging, which for CCG is so effective that every derivation consistent with the supertagger output can be stored in a packed chart. However, wide-coverage CCG parsers still produce a very large number of derivations for typical newspaper or Wikipedia sentences. In this paper we investigate two forms of chart pruning, and develop a novel method for pruning complete cells in a parse chart. The result is a wide-coverage CCG parser that can process almost 100 sentences per second, with little or no loss in accuracy over the baseline with no pruning.
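To make the idea of pruning complete chart cells concrete, the following is a minimal sketch of a CKY-style chart filler that skips any span marked as closed. It is a generic illustration only, not the method developed in the paper: the function name `cky_with_cell_pruning`, the `closed_cells` argument, the toy lexicon, and the toy rule table are all assumptions introduced here, and the abstract does not say how cells are selected for pruning.

```python
from collections import defaultdict


def cky_with_cell_pruning(words, lexicon, rules, closed_cells=frozenset()):
    """Fill a CKY chart, leaving every span in closed_cells empty.

    words        : list of tokens
    lexicon      : word -> set of categories (stand-in for supertagger output)
    rules        : (left_cat, right_cat) -> set of parent categories
    closed_cells : set of (start, end) spans to skip entirely; how these are
                   chosen is outside this sketch (hypothetical input)
    Returns the chart and the number of multi-word cells actually processed.
    """
    n = len(words)
    chart = defaultdict(set)
    visited = 0

    # Width-1 cells hold the lexical categories assigned to each word.
    for i, word in enumerate(words):
        chart[(i, i + 1)] = set(lexicon.get(word, ()))

    # Build wider spans bottom-up, skipping pruned cells.
    for width in range(2, n + 1):
        for start in range(0, n - width + 1):
            end = start + width
            if (start, end) in closed_cells:
                continue  # the whole cell is pruned: no combinations are tried
            visited += 1
            for mid in range(start + 1, end):
                for left in chart[(start, mid)]:
                    for right in chart[(mid, end)]:
                        chart[(start, end)] |= rules.get((left, right), set())
    return chart, visited


# Toy CCG-like fragment (illustrative categories, binary rules only).
words = ["the", "dog", "barks"]
lexicon = {"the": {"NP/N"}, "dog": {"N"}, "barks": {"S\\NP"}}
rules = {("NP/N", "N"): {"NP"}, ("NP", "S\\NP"): {"S"}}

full_chart, full_visited = cky_with_cell_pruning(words, lexicon, rules)
pruned_chart, pruned_visited = cky_with_cell_pruning(
    words, lexicon, rules, closed_cells={(1, 3)}
)
```

In this toy case the span covering "dog barks" never yields a constituent, so closing it reduces the number of processed cells while the sentence-level category is still found. In a real parser the decision about which cells to close would have to come from some separate model or heuristic.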