Optimal Probabilistic Evaluation Functions for Search Controlled by Stochastic Context-Free Grammars

Authors:
Anna Corazza;Renato De Mori;Roberto Gretter;Giorgio Satta
Venue:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Year:
1994

References (5)

Computation of Probabilities for an Island-Driven Parser. IEEE Transactions on Pattern Analysis and Machine Intelligence.
Probabilistic Languages: A Review and Some Open Questions. ACM Computing Surveys (CSUR).
Introduction to Formal Language Theory.
The Theory of Parsing, Translation, and Compiling.
Computation of the probability of initial substring generation by stochastic context-free grammars. Computational Linguistics.

Cited by (5)

Computation of the N Best Parse Trees for Weighted and Stochastic Context-Free Grammars. Proceedings of the Joint IAPR International Workshops on Advances in Pattern Recognition.
Monte-Carlo Sampling for NP-Hard Maximization Problems in the Framework of Weighted Parsing. NLP '00 Proceedings of the Second International Conference on Natural Language Processing.
A* parsing: fast exact Viterbi parse selection. NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1.
Computation of distances for regular and context-free probabilistic languages. Theoretical Computer Science.
Heuristic search for non-bottom-up tree structure prediction. EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing.

Abstract

The use of stochastic context-free grammars (SCFGs) in language modeling (LM) has been considered previously. When these grammars are used, search can be directed by evaluation functions based on the probability that an SCFG generates a sentence, given only some of its words. Jelinek and Lafferty (1991) proposed expressions for computing the evaluation function when recognizing word sequences of which only the prefix is known. Corazza et al. (1991) proposed methods for computing probabilities in the more general case in which partial word sequences interleaved by gaps are known. That computation is too complex in practice unless the lengths of the gaps are known. This paper proposes a method for computing the probability of the best parse tree that can generate a sentence of which only part (consisting of islands and gaps) is known. This probability is the smallest possible, and thus the most informative, upper bound that can be used in the evaluation function. Computing the proposed upper bound has cubic time complexity even if the lengths of the gaps are unknown. This makes it practical to use SCFGs for driving the interpretation of sentences in natural language processing.
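
As a rough illustration of the quantity the abstract describes, the sketch below computes the probability of the best parse tree for a partially known sentence. It is not the paper's algorithm: it covers only the simpler case in which every gap has a known length (each gap position is one unknown word), and it assumes a grammar in Chomsky normal form. The names `GAP`, `best_tree_prob`, `lexical`, and `binary`, as well as the toy grammar, are hypothetical and chosen only for this example.

```python
from collections import defaultdict

GAP = None  # hypothetical marker: one unknown word (gap of known length 1)


def best_tree_prob(tokens, lexical, binary, start="S"):
    """Max probability of a parse tree whose yield agrees with `tokens`.

    tokens  : list of words, with GAP at each unknown position
    lexical : {(A, word): P(A -> word)}
    binary  : {(A, B, C): P(A -> B C)}
    Assumes a CNF grammar; a Viterbi-style CYK pass, cubic in len(tokens).
    """
    n = len(tokens)

    # For a gap position, the best any nonterminal A can do is its most
    # probable lexical rule: max over words a of P(A -> a).
    best_word = defaultdict(float)
    for (lhs, _word), p in lexical.items():
        best_word[lhs] = max(best_word[lhs], p)

    # chart[i][k][A] = best probability of A deriving positions i..k-1
    chart = [[defaultdict(float) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, tok in enumerate(tokens):
        cell = chart[i][i + 1]
        if tok is GAP:
            for lhs, p in best_word.items():
                cell[lhs] = p
        else:
            for (lhs, word), p in lexical.items():
                if word == tok:
                    cell[lhs] = max(cell[lhs], p)

    for width in range(2, n + 1):
        for i in range(n - width + 1):
            k = i + width
            cell = chart[i][k]
            for j in range(i + 1, k):
                left, right = chart[i][j], chart[j][k]
                for (lhs, b, c), p in binary.items():
                    if b in left and c in right:
                        score = p * left[b] * right[c]
                        if score > cell[lhs]:
                            cell[lhs] = score

    return chart[0][n].get(start, 0.0)


# Toy grammar (made up for illustration): S -> A B, A -> a | c, B -> b | d
toy_binary = {("S", "A", "B"): 1.0}
toy_lexical = {
    ("A", "a"): 0.7, ("A", "c"): 0.3,
    ("B", "b"): 0.2, ("B", "d"): 0.8,
}

fully_known = best_tree_prob(["a", "b"], toy_lexical, toy_binary)
with_gap = best_tree_prob(["a", GAP], toy_lexical, toy_binary)
```

In this toy case the value for `["a", GAP]` is at least the best-tree probability of every completion of the sentence, including `["a", "b"]`, which is the sense in which such a score can serve as an upper bound in an evaluation function. Handling gaps of unknown length, as the paper does, requires additional quantities that this sketch does not attempt.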