Constructing parse forests that include exactly the n-best PCFG trees

Authors:
Pierre Boullier;Alexis Nasr;Benoît Sagot
Affiliations:
INRIA Paris-Rocquencourt & Université Paris, Le Chesnay Cedex, France;Univ. de la Méditerranée, Marseille Cedex, France;INRIA Paris-Rocquencourt & Université Paris, Le Chesnay Cedex, France
Venue:
IWPT '09 Proceedings of the 11th International Conference on Parsing Technologies
Year:
2009

Abstract

This paper describes and compares two algorithms that take as input a shared PCFG parse forest and produce shared forests that contain exactly the n most likely trees of the initial forest. Such forests are suitable for subsequent processing, such as (some types of) reranking or LFG f-structure computation, that can be performed on top of a shared forest but may have a high (e.g., exponential) complexity w.r.t. the number of trees contained in the forest. We evaluate the performance of both algorithms on real-scale NLP forests generated with a PCFG extracted from the Penn Treebank.
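
To make the notion of "the n most likely trees of a shared forest" concrete, the following is a minimal sketch that enumerates them for a toy forest. It is not either of the paper's two algorithms, which build a new shared forest rather than a list of trees. The forest encoding, the node names, the probabilities, and the function name `n_best` are all assumptions made for illustration.

```python
import heapq
from itertools import product

# Hypothetical toy forest (an assumption, not data from the paper):
# each node maps to its alternative expansions, given as
# (rule probability, child nodes). Nodes absent from the dict are terminals.
FOREST = {
    "S": [(0.9, ("A", "B")), (0.1, ("C",))],
    "A": [(0.6, ("a1",)), (0.4, ("a2",))],
    "B": [(0.7, ("b1",)), (0.3, ("b2",))],
    "C": [(1.0, ("c",))],
}


def n_best(forest, node, n):
    """Return up to n (probability, tree) pairs rooted at node, best first.

    A tree's probability is the product of the probabilities of the rules
    it uses, so the n best trees of a node only ever use trees among the
    n best of each child. That is why keeping n candidates per node suffices.
    """
    if node not in forest:
        # Terminal: a single tree with probability 1.
        return [(1.0, node)]
    candidates = []
    for rule_prob, children in forest[node]:
        child_lists = [n_best(forest, child, n) for child in children]
        # Combine every choice of sub-tree for each child of this expansion.
        for combo in product(*child_lists):
            prob = rule_prob
            for child_prob, _ in combo:
                prob *= child_prob
            tree = (node,) + tuple(subtree for _, subtree in combo)
            candidates.append((prob, tree))
    return heapq.nlargest(n, candidates, key=lambda cand: cand[0])


if __name__ == "__main__":
    for prob, tree in n_best(FOREST, "S", 3):
        print(f"{prob:.3f}", tree)
```

This sketch recomputes shared sub-forests on every visit and tries every combination of child candidates, which is acceptable for a toy example only; a practical version would at least memoize the per-node results.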