Probabilistic parsing strategies
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Journal of the ACM (JACM)
We discuss existing approaches to training LR parsers, which have been used for the statistical resolution of structural ambiguity. These approaches are nonoptimal, in the sense that not every probability distribution can be obtained: under the restrictions imposed by the existing training methods, some probability distributions expressible in terms of a context-free grammar cannot be expressed in terms of the LR parser constructed from that grammar. We present an alternative way of training that is provably optimal, in that it allows every probability distribution expressible by the context-free grammar to be carried over to the LR parser. We also demonstrate empirically that this kind of training can be applied effectively to a large treebank.
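The notion of a "probability distribution expressible in terms of a context-free grammar" can be illustrated with a small sketch (not taken from the paper): a probabilistic context-free grammar (PCFG) attaches a probability to each rule, the probabilities of rules sharing a left-hand side sum to 1, and the probability of a derivation tree is the product of the probabilities of the rules it uses. The toy grammar below is a hypothetical example chosen only for illustration.

```python
# Minimal sketch of a PCFG-defined distribution over derivation trees.
# Toy grammar (hypothetical, for illustration): S -> S S (0.4) | a (0.6).
# For each left-hand side, the rule probabilities sum to 1.
pcfg = {
    "S": {("S", "S"): 0.4, ("a",): 0.6},
}

def derivation_prob(tree):
    """Probability of a derivation tree = product of its rule probabilities.

    A tree is either a terminal string, or a pair (nonterminal, children)
    where each child is itself a tree.
    """
    if isinstance(tree, str):
        return 1.0  # a terminal leaf contributes no rule, so factor 1
    lhs, children = tree
    # The right-hand side is the sequence of child root symbols.
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = pcfg[lhs][rhs]
    for child in children:
        p *= derivation_prob(child)
    return p

# The tree for "aa" using S -> S S and twice S -> a has
# probability 0.4 * 0.6 * 0.6 = 0.144 (up to float rounding).
tree = ("S", [("S", ["a"]), ("S", ["a"])])
print(derivation_prob(tree))
```

Training an LR parser then amounts to attaching probabilities to parser actions instead of grammar rules; the paper's point is that the standard way of doing so cannot always reproduce a distribution of the kind computed above.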