Sparse multi-scale grammars for discriminative latent variable parsing

Authors:
Slav Petrov;Dan Klein
Affiliations:
University of California at Berkeley, Berkeley, CA;University of California at Berkeley, Berkeley, CA
Venue:
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Year:
2008

Citing 19
Cited 8

Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Head-driven statistical models for natural language parsing

Head-driven statistical models for natural language parsing
Building a large annotated corpus of English: the penn treebank

Computational Linguistics - Special issue on using large corpora: II
PCFG models of linguistic tree representations

Computational Linguistics
A maximum-entropy-inspired parser

NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
An annotation scheme for free word order languages

ANLC '97 Proceedings of the fifth conference on Applied natural language processing
Joint and conditional estimation of tagging and parsing models

ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
Accurate unlexicalized parsing

ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Discriminative training of a neural network statistical parser

ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Parsing the WSJ using CCG and log-linear models

ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Probabilistic CFG with latent annotations

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Coarse-to-fine n-best parsing and MaxEnt discriminative reranking

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Lexicalization in crosslinguistic probabilistic parsing: the case of French

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Learning accurate, compact, and interpretable tree annotation

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Hidden-variable models for discriminative reranking

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Multilevel coarse-to-fine PCFG parsing

HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Probabilistic context-free grammar induction based on structural zeros

HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Scalable training of L1-regularized log-linear models

Proceedings of the 24th international conference on Machine learning
Weighted and probabilistic context-free grammars are equally expressive

Computational Linguistics

Self-training PCFG grammars with latent annotations across languages

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2 - Volume 2
Products of random latent variable grammars

HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Simple, accurate parsing with an all-fragments grammar

ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Inducing sentence structure from parallel corpora for reordering

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Dual decomposition with many overlapping components

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Structured sparsity in structured prediction

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Higher-order constituent parsing and parser combination

ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2
Training factored PCFGs with expectation propagation

EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

Quantified Score

Hi-index	0.00

Visualization

Abstract

We present a discriminative, latent variable approach to syntactic parsing in which rules exist at multiple scales of refinement. The model is formally a latent variable CRF grammar over trees, learned by iteratively splitting grammar productions (not categories). Different regions of the grammar are refined to different degrees, yielding grammars which are three orders of magnitude smaller than the single-scale baseline and 20 times smaller than the split-and-merge grammars of Petrov et al. (2006). In addition, our discriminative approach integrally admits features beyond local tree configurations. We present a multiscale training method along with an efficient CKY-style dynamic program. On a variety of domains and languages, this method produces the best published parsing accuracies with the smallest reported grammars.