Exploiting unannotated natural language data is hard largely because unsupervised parameter estimation is hard. We describe deterministic annealing (DA; Rose et al., 1990) as an appealing alternative to the Expectation-Maximization (EM) algorithm (Dempster et al., 1977). Seeking to avoid search error, DA begins by globally maximizing an easy concave function and maintains a local maximum as it gradually morphs the function into the desired non-concave likelihood function. We show that applying DA to parsing and tagging models is straightforward, and we report significant improvements over EM on a part-of-speech tagging task. We also describe a variant, skewed DA, which can incorporate a good initializer when one is available, and show significant improvements over EM on a grammar induction task.
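
The annealing idea described above can be illustrated on a toy problem. The following is a minimal sketch, not the tagging or parsing models from the paper: it assumes a one-dimensional Gaussian mixture with equal weights and unit variance, and it uses the common formulation in which the E-step posterior is raised to a power β that is increased from near 0 to 1. At small β the posteriors are nearly uniform, so all component means collapse toward the data mean; β = 1 recovers ordinary EM. The function names, the β schedule, and the small deterministic jitter (added so that collapsed components can separate again) are all assumptions made for illustration.

```python
import math


def tempered_posteriors(xs, means, beta):
    """E-step: component posteriors with the log joint scaled by beta."""
    posts = []
    for x in xs:
        # Equal weights and unit variance, so constants cancel in the ratio.
        scores = [beta * (-0.5 * (x - m) ** 2) for m in means]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        posts.append([w / total for w in weights])
    return posts


def update_means(xs, posts, k):
    """M-step: posterior-weighted average of the data for each component."""
    means = []
    for j in range(k):
        mass = sum(p[j] for p in posts)
        means.append(sum(p[j] * x for p, x in zip(posts, xs)) / mass)
    return means


def da_em(xs, init_means, betas, inner_iters=50, jitter=1e-3):
    """Run EM at each beta in turn, warm-starting from the previous result."""
    means = list(init_means)
    k = len(means)
    for beta in betas:
        # Assumed symmetry breaker: nudge components apart before each stage.
        means = [m + jitter * (j - (k - 1) / 2) for j, m in enumerate(means)]
        for _ in range(inner_iters):
            means = update_means(xs, tempered_posteriors(xs, means, beta), k)
    return means


if __name__ == "__main__":
    data = [-2.2, -2.0, -1.8, 1.8, 2.0, 2.2]
    schedule = [0.01, 0.05, 0.1, 0.25, 0.5, 1.0]
    print(da_em(data, [0.0, 0.0], schedule))
```

In this sketch the initial means matter very little, because the low-β stages pull every component to the same point. That is the property the skewed variant relaxes in order to make use of a good initializer; the sketch does not implement that variant.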