A view of the EM algorithm that justifies incremental, sparse, and other variants
Learning in graphical models
Building a large annotated corpus of English: the Penn Treebank
Computational Linguistics - Special Issue on Using Large Corpora: II
Corpus-based induction of syntactic structure: models of dependency and constituency
ACL '04 Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics
Annealing structural bias in multilingual weighted grammar induction
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Simple, robust, scalable semi-supervised learning via expectation regularization
ICML '07 Proceedings of the 24th International Conference on Machine Learning
Learning from measurements in exponential families
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Shared logistic normal distributions for soft parameter tying in unsupervised grammar induction
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Improving unsupervised dependency parsing with richer contexts and smoothing
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Alternating projections for learning with expectation constraints
UAI '09 Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence
Posterior Regularization for Structured Latent Variable Models
The Journal of Machine Learning Research
Covariance in Unsupervised Learning of Probabilistic Grammars
The Journal of Machine Learning Research
Neutralizing linguistically problematic annotations in unsupervised dependency parsing evaluation
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Data point selection for cross-language adaptation of dependency parsers
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
From ranked words to dependency trees: two-stage unsupervised non-projective dependency parsing
TextGraphs-6 Proceedings of TextGraphs-6: Graph-based Methods for Natural Language Processing
Unsupervised structure prediction with non-parallel multilingual guidance
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Cross-lingual word clusters for direct transfer of linguistic structure
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Two baselines for unsupervised dependency parsing
WILS '12 Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure
Combining the sparsity and unambiguity biases for grammar induction
WILS '12 Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure
Unambiguity regularization for unsupervised learning of probabilistic grammars
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Improved parsing and POS tagging using inter-sentence consistency constraints
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Induction of dependency structures based on weighted projection
ICCCI'12 Proceedings of the 4th international conference on Computational Collective Intelligence: technologies and applications - Volume Part I
A strong inductive bias is essential in unsupervised grammar induction. We explore a particular sparsity bias in dependency grammars that encourages a small number of unique dependency types. Specifically, we investigate sparsity-inducing penalties on the posterior distributions of parent-child POS tag pairs in the posterior regularization (PR) framework of Graça et al. (2007). In experiments with 12 languages, we achieve substantial gains over the standard expectation maximization (EM) baseline, with an average improvement in attachment accuracy of 6.3%. Further, our method outperforms models based on a standard Bayesian sparsity-inducing prior by an average of 4.9%. On English in particular, we show that our approach improves on several other state-of-the-art techniques.
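The sparsity penalty described above can be sketched concretely. In an ℓ1/ℓ∞-style posterior penalty of the kind used in PR for grammar induction, one takes, for each parent-child POS tag pair, the maximum expected usage across sentences, and sums these maxima over tag pairs; penalizing this sum drives many tag pairs toward zero expected usage everywhere, i.e. a small set of unique dependency types. The function and variable names below are illustrative, not from the paper, and this is a minimal sketch of the penalty term only, assuming expected counts have already been computed from the model's posteriors.

```python
import numpy as np

def l1_linf_penalty(expected_counts):
    """Illustrative L1/Linf posterior sparsity penalty.

    expected_counts: array of shape (num_sentences, num_tag_pairs), where
    entry [i, p] is the expected number of times parent-child tag pair p
    is used as a dependency in sentence i under the current posterior.
    """
    per_pair_max = expected_counts.max(axis=0)  # Linf over sentences, per tag pair
    return per_pair_max.sum()                   # L1 over tag pairs

# Toy example: 3 sentences, 4 possible parent-child tag pairs.
# Pairs 1 and 3 are never used, so they contribute nothing to the penalty.
counts = np.array([
    [0.9, 0.0, 0.1, 0.0],
    [0.8, 0.0, 0.2, 0.0],
    [1.0, 0.0, 0.0, 0.0],
])
print(l1_linf_penalty(counts))  # 1.0 + 0.0 + 0.2 + 0.0 = 1.2
```

Minimizing this penalty alongside the data likelihood (rather than placing a sparsity-inducing prior on the parameters themselves) is what distinguishes the posterior-regularization approach the abstract describes.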