Probabilistic parsing strategies
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Journal of the ACM (JACM)
We discuss existing approaches to training LR parsers, which have been used for the statistical resolution of structural ambiguity. These approaches are nonoptimal, in the sense that not every probability distribution can be obtained: under the restrictions imposed by the existing training methods, some probability distributions expressible in terms of a context-free grammar cannot be expressed in terms of the LR parser constructed from that grammar. We present an alternative way of training that is provably optimal, in that it allows every probability distribution expressible by the context-free grammar to be carried over to the LR parser. We also demonstrate empirically that this kind of training can be applied effectively to a large treebank.
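The notion of a "probability distribution expressible in terms of a context-free grammar" can be illustrated with a small sketch (not taken from the paper): a probabilistic context-free grammar (PCFG) attaches a probability to each rule, the probabilities of rules sharing a left-hand side sum to 1, and the probability of a derivation tree is the product of the probabilities of the rules it uses. The toy grammar below is a hypothetical example chosen only for illustration.

```python
# Minimal sketch of a PCFG-defined distribution over derivation trees.
# Toy grammar (hypothetical, for illustration): S -> S S (0.4) | a (0.6).
# For each left-hand side, the rule probabilities sum to 1.
pcfg = {
    "S": {("S", "S"): 0.4, ("a",): 0.6},
}

def derivation_prob(tree):
    """Probability of a derivation tree = product of its rule probabilities.

    A tree is either a terminal string, or a pair (nonterminal, children)
    where each child is itself a tree.
    """
    if isinstance(tree, str):
        return 1.0  # a terminal leaf contributes no rule, so factor 1
    lhs, children = tree
    # The right-hand side is the sequence of child root symbols.
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = pcfg[lhs][rhs]
    for child in children:
        p *= derivation_prob(child)
    return p

# The tree for "aa" using S -> S S and twice S -> a has
# probability 0.4 * 0.6 * 0.6 = 0.144 (up to float rounding).
tree = ("S", [("S", ["a"]), ("S", ["a"])])
print(derivation_prob(tree))
```

Training an LR parser then amounts to attaching probabilities to parser actions instead of grammar rules; the paper's point is that the standard way of doing so cannot always reproduce a distribution of the kind computed above.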