Are very large context-free grammars tractable?

Authors:
Pierre Boullier;Benoît Sagot
Affiliations:
INRIA-Rocquencourt, Rocquencourt, Chesnay Cedex, France;INRIA-Rocquencourt, Rocquencourt, Chesnay Cedex, France
Venue:
IWPT '07 Proceedings of the 10th International Conference on Parsing Technologies
Year:
2007

Abstract

In this paper, we present a method which, in practice, makes it possible to use parsers for languages defined by very large context-free grammars (over a million symbol occurrences). The idea is to split the parsing process into two passes. The first pass computes a sub-grammar, a specialized part of the large grammar selected by the input text and various filtering strategies. The second pass is a traditional parser that works with the sub-grammar and the input text. This approach is validated by practical experiments performed with an Earley-like parser running on a test set with two large context-free grammars.
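
The two-pass idea can be illustrated with a minimal Python sketch. The abstract does not spell out the filtering strategies, so the one used here is an assumption chosen for illustration: drop productions that mention a word absent from the input, then prune non-productive and unreachable symbols. The names `filter_grammar`, `earley_recognize`, and the toy `GRAMMAR` are hypothetical, and the recognizer assumes a grammar without empty right-hand sides.

```python
from typing import List, Tuple

Production = Tuple[str, Tuple[str, ...]]  # (left-hand side, right-hand side)

# Toy grammar (hypothetical): upper-case symbols are nonterminals, words are terminals.
GRAMMAR: List[Production] = [
    ("S", ("NP", "VP")),
    ("NP", ("the", "N")), ("NP", ("a", "N")),
    ("N", ("cat",)), ("N", ("dog",)), ("N", ("fish",)),
    ("VP", ("sleeps",)), ("VP", ("barks",)), ("VP", ("V", "NP")),
    ("V", ("sees",)), ("V", ("eats",)),
]

def filter_grammar(productions: List[Production], start: str,
                   tokens: List[str]) -> List[Production]:
    """First pass (illustrative filter, not the paper's): select a sub-grammar."""
    nonterminals = {lhs for lhs, _ in productions}
    words = set(tokens)
    # 1. Drop productions that mention a terminal absent from the input.
    kept = [(lhs, rhs) for lhs, rhs in productions
            if all(s in nonterminals or s in words for s in rhs)]
    # 2. Fixpoint: a nonterminal is productive if some kept production
    #    rewrites it into terminals and productive nonterminals only.
    productive = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in kept:
            if lhs not in productive and all(
                    s in productive or s not in nonterminals for s in rhs):
                productive.add(lhs)
                changed = True
    kept = [(lhs, rhs) for lhs, rhs in kept
            if all(s in productive or s not in nonterminals for s in rhs)]
    # 3. Keep only productions whose left-hand side is reachable from the start symbol.
    reachable, agenda = {start}, [start]
    while agenda:
        current = agenda.pop()
        for lhs, rhs in kept:
            if lhs == current:
                for s in rhs:
                    if s in nonterminals and s not in reachable:
                        reachable.add(s)
                        agenda.append(s)
    return [(lhs, rhs) for lhs, rhs in kept if lhs in reachable]

def earley_recognize(productions: List[Production], start: str,
                     tokens: List[str]) -> bool:
    """Second pass: a plain Earley recognizer (assumes no empty right-hand sides)."""
    by_lhs = {}
    for lhs, rhs in productions:
        by_lhs.setdefault(lhs, []).append(rhs)
    n = len(tokens)
    # chart[i] holds items (lhs, rhs, dot position, origin index).
    chart = [set() for _ in range(n + 1)]
    for rhs in by_lhs.get(start, []):
        chart[0].add((start, rhs, 0, 0))
    for i in range(n + 1):
        agenda = list(chart[i])
        while agenda:
            lhs, rhs, dot, origin = agenda.pop()
            new_items = []
            if dot < len(rhs):
                sym = rhs[dot]
                if sym in by_lhs:  # predict
                    new_items = [(sym, r, 0, i) for r in by_lhs[sym]]
                elif i < n and tokens[i] == sym:  # scan
                    chart[i + 1].add((lhs, rhs, dot + 1, origin))
            else:  # complete
                new_items = [(l2, r2, d2 + 1, o2)
                             for l2, r2, d2, o2 in list(chart[origin])
                             if d2 < len(r2) and r2[d2] == lhs]
            for item in new_items:
                if item not in chart[i]:
                    chart[i].add(item)
                    agenda.append(item)
    return any(lhs == start and dot == len(rhs) and origin == 0
               for lhs, rhs, dot, origin in chart[n])
```

Calling `filter_grammar(GRAMMAR, "S", ["the", "cat", "sleeps"])` and passing the result to `earley_recognize` with the same tokens runs the second pass on the reduced grammar only. On this toy grammar the saving is small; the premise of the approach is that the sub-grammar selected by one input is far smaller than a grammar with over a million symbol occurrences.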