An efficient probabilistic context-free parsing algorithm that computes prefix probabilities

Authors: Andreas Stolcke
Affiliations: University of California at Berkeley
Venue: Computational Linguistics
Year: 1995

Abstract

We describe an extension of Earley's parser for stochastic context-free grammars (SCFGs) that, given an SCFG and an input string, computes the following quantities: a) probabilities of successive prefixes being generated by the grammar; b) probabilities of substrings being generated by the nonterminals, including the entire string being generated by the grammar; c) most likely (Viterbi) parse of the string; d) posterior expected number of applications of each grammar production, as required for reestimating rule probabilities. Probabilities (a) and (b) are computed incrementally in a single left-to-right pass over the input. Our algorithm compares favorably to standard bottom-up parsing methods for SCFGs in that it works efficiently on sparse grammars by making use of Earley's top-down control structure. It can process any context-free rule format without conversion to some normal form, and combines computations for (a) through (d) in a single algorithm. Finally, the algorithm has simple extensions for processing partially bracketed inputs, and for finding partial parses and their likelihoods on ungrammatical inputs.
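
To make quantities (a) and (b) concrete, the following is a minimal Python sketch of an Earley-style chart in which every state carries a forward and an inner probability. It is an illustration under simplifying assumptions, not the paper's algorithm: it assumes the grammar has no left recursion, no unit productions of the form A -> B, and no empty productions, so it omits the special handling those cases require, and its recursive prediction step is not tuned for efficiency. The function name `earley_prefix_probs`, the `rules` dictionary layout, and the toy grammar are hypothetical choices made for this example.

```python
from collections import defaultdict


def earley_prefix_probs(rules, start, tokens):
    """Return (prefix_probs, string_prob) for `tokens` under the SCFG `rules`.

    `rules` maps each nonterminal to a list of (rhs_tuple, probability).
    Assumption (not from the paper): the grammar has no left recursion,
    no unit productions A -> B, and no empty productions.
    """
    n = len(tokens)
    # chart[i] maps a state (start, lhs, rhs, dot) to [forward, inner].
    chart = [defaultdict(lambda: [0.0, 0.0]) for _ in range(n + 1)]

    def next_symbol(state):
        _, _, rhs, dot = state
        return rhs[dot] if dot < len(rhs) else None

    def predict(i, sym, forward):
        # Expand `sym` at position i and pass forward mass down left corners.
        for rhs, p in rules[sym]:
            entry = chart[i][(i, sym, rhs, 0)]
            entry[0] += forward * p
            entry[1] = p
            if rhs[0] in rules:
                predict(i, rhs[0], forward * p)

    predict(0, start, 1.0)
    prefix_probs = [1.0]  # probability of the empty prefix
    for i in range(1, n + 1):
        # Scanning: advance the dot over the token just read.
        scanned = 0.0
        for state, (fwd, inner) in list(chart[i - 1].items()):
            if next_symbol(state) == tokens[i - 1]:
                k, lhs, rhs, dot = state
                entry = chart[i][(k, lhs, rhs, dot + 1)]
                entry[0] += fwd
                entry[1] += inner
                scanned += fwd
        # The prefix probability is the forward mass of the scanned states.
        prefix_probs.append(scanned)
        # Completion: handle later start positions first, so each inner
        # probability is fully summed before it is used.
        for j in range(i - 1, -1, -1):
            done = [(s, v[1]) for s, v in chart[i].items()
                    if s[0] == j and s[3] == len(s[2])]
            for (_, lhs, _, _), inner_done in done:
                for state, (fwd, inner) in chart[j].items():
                    if next_symbol(state) == lhs:
                        k, lhs2, rhs2, dot2 = state
                        entry = chart[i][(k, lhs2, rhs2, dot2 + 1)]
                        entry[0] += fwd * inner_done
                        entry[1] += inner * inner_done
        # Prediction: expand nonterminals after the dot of non-predicted states.
        for state, (fwd, _) in list(chart[i].items()):
            sym = next_symbol(state)
            if state[3] > 0 and sym in rules:
                predict(i, sym, fwd)
    # String probability: inner mass of complete start-symbol states over 0..n.
    string_prob = sum(v[1] for s, v in chart[n].items()
                      if s[0] == 0 and s[1] == start and s[3] == len(s[2]))
    return prefix_probs, string_prob


# Hypothetical toy grammar: S -> a S b (0.4) | a b (0.6).
toy_rules = {"S": [(("a", "S", "b"), 0.4), (("a", "b"), 0.6)]}
prefix_probs, string_prob = earley_prefix_probs(toy_rules, "S", ["a", "a", "b", "b"])
print(prefix_probs, string_prob)
```

For this toy grammar the prefix probabilities come out as approximately 1.0, 1.0, 0.4, 0.24 and 0.24 (the first entry is the empty prefix), and the probability of the full string "a a b b" is approximately 0.24. Both are produced in the same left-to-right pass over the input.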