A view of the EM algorithm that justifies incremental, sparse, and other variants. In Learning in Graphical Models.
Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics (special issue on using large corpora: II).
An analysis of the EM algorithm and entropy-like proximal point methods. Mathematics of Operations Research.
Corpus-based induction of syntactic structure: models of dependency and constituency. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL '04).
Annealing structural bias in multilingual weighted grammar induction. In Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics (COLING/ACL '06).
Novel estimation methods for unsupervised discovery of latent structure in natural language text.
Simple, robust, scalable semi-supervised learning via expectation regularization. In Proceedings of the 24th International Conference on Machine Learning (ICML '07).
Learning from measurements in exponential families. In Proceedings of the 26th Annual International Conference on Machine Learning (ICML '09).
Automatic selection of high quality parses created by a fully unsupervised parser. In Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL '09).
Evaluating unsupervised part-of-speech tagging for grammar induction. In Proceedings of the 22nd International Conference on Computational Linguistics (COLING '08), Volume 1.
Shared logistic normal distributions for soft parameter tying in unsupervised grammar induction. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL '09).
Improving unsupervised dependency parsing with richer contexts and smoothing. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL '09).
Alternating projections for learning with expectation constraints. In Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI '09).
From baby steps to Leapfrog: how "Less is More" in unsupervised dependency parsing. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL '10).
Minimized models and grammar-informed initialization for supertagging with highly ambiguous lexicons. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL '10).
Posterior regularization for structured latent variable models. Journal of Machine Learning Research.
Learning tractable word alignment models with complex constraints. Computational Linguistics.
Rich prior knowledge in learning for NLP. In Tutorial Abstracts of the 49th Annual Meeting of the Association for Computational Linguistics (ACL HLT '11).
Concavity and initialization for unsupervised dependency parsing. In Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT '12).
Fast unsupervised dependency parsing with arc-standard transitions. In Proceedings of the Joint Workshop on Unsupervised and Semi-Supervised Learning in NLP (ROBUS-UNSUP '12).
Unsupervised dependency parsing using reducibility and fertility features. In Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure (WILS '12).
Exploiting reducibility in unsupervised dependency parsing. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL '12).
A strong inductive bias is essential in unsupervised grammar induction. In this paper, we explore a particular sparsity bias in dependency grammars that encourages a small number of unique dependency types. We use part-of-speech (POS) tags to group dependencies by parent-child types and investigate sparsity-inducing penalties on the posterior distributions of parent-child POS tag pairs in the posterior regularization (PR) framework of Graça et al. (2007). In experiments with 12 different languages, we achieve significant gains in directed attachment accuracy over the standard expectation maximization (EM) baseline, with an average accuracy improvement of 6.5%, outperforming EM by at least 1% for 9 out of 12 languages. Furthermore, the new method outperforms models based on standard Bayesian sparsity-inducing parameter priors with an average improvement of 5% and positive gains of at least 1% for 9 out of 12 languages. On English text in particular, we show that our approach improves performance over other state-of-the-art techniques.
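For concreteness, the sketch below illustrates the kind of l1/l-infinity sparsity penalty on posterior expectations of parent-child POS tag pairs that the abstract describes. It is not the authors' implementation: the names (l1_linf_penalty, exp_counts, sigma) are illustrative, and the max is taken per sentence rather than per individual edge occurrence, a simplification of the paper's formulation.

```python
# Illustrative sketch of an l1/l-infinity sparsity penalty on posterior
# expectations of parent-child POS tag pairs (not the authors' code).
import numpy as np

def l1_linf_penalty(exp_counts: np.ndarray, sigma: float) -> float:
    """exp_counts[i, t] = expected number of edges with parent-child tag
    pair t in sentence i under the current posterior q. Each tag pair is
    charged its largest expected count across sentences, so pairs the
    grammar never uses cost nothing and reused pairs are charged once."""
    return sigma * exp_counts.max(axis=0).sum()

def pr_e_step_objective(kl_q_p: float, exp_counts: np.ndarray,
                        sigma: float) -> float:
    """PR's E-step picks the posterior q minimizing KL(q || p_theta) plus
    the penalty, instead of setting q = p_theta as in standard EM."""
    return kl_q_p + l1_linf_penalty(exp_counts, sigma)

# Toy contrast: four sentences, one expected edge each. Spreading the
# edges over four distinct tag pairs costs four times as much as reusing
# a single tag pair, which is the sparsity bias the abstract describes.
scattered = np.eye(4)          # sentence i uses tag pair i
reused = np.zeros((4, 4))
reused[:, 0] = 1.0             # every sentence reuses tag pair 0
print(l1_linf_penalty(scattered, sigma=1.0))  # 4.0
print(l1_linf_penalty(reused, sigma=1.0))     # 1.0
```

Minimizing this penalized objective over q pushes posterior mass toward a small set of dependency types, whereas a Bayesian sparsity prior acts on the model parameters rather than on the posteriors.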