Computation of the N Best Parse Trees for Weighted and Stochastic Context-Free Grammars

Authors:
Víctor M. Jiménez; Andrés Marzal
Venue:
Proceedings of the Joint IAPR International Workshops on Advances in Pattern Recognition
Year:
2000

Abstract

Context-free grammars are attracting increasing interest in the pattern recognition research community as a way to overcome the limited modeling capabilities of the simpler regular grammars, with applications in fields such as language modeling, speech recognition, optical character recognition, and computational biology. This paper proposes an efficient algorithm for one of the problems that arise when using weighted and stochastic context-free grammars: computing the N best parse trees of a given string. After the best parse tree has been computed with the CYK algorithm, a large number of alternative parse trees are obtained, in order of weight (or probability), in a small fraction of the time the CYK algorithm needs to find the best parse tree. This is confirmed by experimental results with grammars from two different domains: a chromosome grammar and a grammar modeling natural language sentences from the Wall Street Journal corpus.
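
To make the problem statement concrete, the following is a minimal sketch of N-best parsing for a stochastic grammar in Chomsky normal form. It is not the algorithm proposed in the paper: it uses a simple eager variant of CYK that keeps up to N derivations in every chart cell, whereas the abstract describes obtaining the alternatives after the best tree has been found. The function name `n_best_parses`, the rule encoding, and the toy grammar are all assumptions made for illustration.

```python
import heapq
import math
from collections import defaultdict


def n_best_parses(words, lexical, binary, start, n):
    """Return up to n (log_prob, tree) pairs for `words`, best first.

    Hypothetical encoding (not from the paper):
      lexical: dict mapping a word to a list of (A, log_prob) for rules A -> word
      binary:  list of (A, B, C, log_prob) for rules A -> B C
    Trees are nested tuples: (A, word) or (A, left_subtree, right_subtree).
    """
    length = len(words)
    # chart[(i, j)][A] holds up to n (log_prob, tree) pairs, best first,
    # for derivations of words[i:j] rooted at nonterminal A.
    chart = defaultdict(dict)

    def keep_best(items):
        # Compare on log-probability only, never on the tree itself.
        return heapq.nlargest(n, items, key=lambda item: item[0])

    # Spans of length 1: lexical rules.
    for i, word in enumerate(words):
        candidates = defaultdict(list)
        for a, lp in lexical.get(word, []):
            candidates[a].append((lp, (a, word)))
        for a, items in candidates.items():
            chart[(i, i + 1)][a] = keep_best(items)

    # Longer spans: combine the kept derivations of the two sub-spans.
    for span in range(2, length + 1):
        for i in range(length - span + 1):
            j = i + span
            candidates = defaultdict(list)
            for k in range(i + 1, j):
                left, right = chart[(i, k)], chart[(k, j)]
                for a, b, c, lp in binary:
                    for lp_b, tree_b in left.get(b, ()):
                        for lp_c, tree_c in right.get(c, ()):
                            candidates[a].append(
                                (lp + lp_b + lp_c, (a, tree_b, tree_c))
                            )
            for a, items in candidates.items():
                chart[(i, j)][a] = keep_best(items)

    return chart[(0, length)].get(start, [])


# Toy ambiguous grammar (illustrative only): "a b c" has two parses.
lexical = {
    "a": [("A", math.log(1.0))],
    "b": [("B", math.log(1.0))],
    "c": [("C", math.log(1.0))],
}
binary = [
    ("S", "X", "C", math.log(0.6)),  # S -> X C
    ("S", "A", "Y", math.log(0.4)),  # S -> A Y
    ("X", "A", "B", math.log(1.0)),  # X -> A B
    ("Y", "B", "C", math.log(1.0)),  # Y -> B C
]

results = n_best_parses(["a", "b", "c"], lexical, binary, "S", 5)
for log_prob, tree in results:
    print(round(math.exp(log_prob), 3), tree)
```

Keeping N derivations per cell is sufficient because a tree among the N best for a span can only be built from subtrees that are themselves among the N best for their own spans. The price is that every cell does N-best work up front, even for cells that never contribute to the final list, which is the kind of overhead a method that enumerates alternatives only after the best parse is known would aim to avoid.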