Class-based n-gram models of natural language
Computational Linguistics
Algorithms for bigram and trigram word clustering
Speech Communication
A view of the EM algorithm that justifies incremental, sparse, and other variants
Learning in graphical models
Building a large annotated corpus of English: the penn treebank
Computational Linguistics - Special issue on using large corpora: II
Tagging English text with a probabilistic model
Computational Linguistics
Distributional part-of-speech tagging
EACL '95 Proceedings of the seventh conference on European chapter of the Association for Computational Linguistics
Combining distributional and morphological information for part of speech induction
EACL '03 Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 1
Feature-rich part-of-speech tagging with a cyclic dependency network
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Bootstrapping parsers via syntactic projection across parallel texts
Natural Language Engineering
Contrastive estimation: training log-linear models on unlabeled data
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Toward unsupervised whole-corpus tagging
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Prototype-driven learning for sequence models
HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Comparing clusterings---an information based distance
Journal of Multivariate Analysis
The NVI clustering evaluation measure
CoNLL '09 Proceedings of the Thirteenth Conference on Computational Natural Language Learning
Evaluating unsupervised part-of-speech tagging for grammar induction
COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
A comparison of Bayesian estimators for unsupervised Hidden Markov Model POS taggers
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Unsupervised multilingual learning for POS tagging
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Minimized models for unsupervised part-of-speech tagging
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
A simple unsupervised learner for POS disambiguation rules given only a minimal lexicon
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2 - Volume 2
Painless unsupervised learning with features
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
SVD and clustering for unsupervised POS tagging
ACLShort '10 Proceedings of the ACL 2010 Conference Short Papers
Posterior Regularization for Structured Latent Variable Models
The Journal of Machine Learning Research
Crouching Dirichlet, hidden Markov model: unsupervised POS tagging with context local tag generation
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Two decades of unsupervised POS induction: how far have we come?
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Latent-descriptor clustering for unsupervised POS induction
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Simple type-level unsupervised POS tagging
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
The PASCAL Challenge on Grammar Induction
WILS '12 Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure
Wiki-ly supervised part-of-speech tagging
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Hi-index | 0.00 |
We consider the problem of fully unsupervised learning of grammatical (part-of-speech) categories from unlabeled text. The standard maximum-likelihood hidden Markov model for this task performs poorly, because of its weak inductive bias and large model capacity. We address this problem by refining the model and modifying the learning objective to control its capacity via parametric and non-parametric constraints. Our approach enforces word-category association sparsity, adds morphological and orthographic features, and eliminates hard-to-estimate parameters for rare words. We develop an efficient learning algorithm that is not much more computationally intensive than standard training. We also provide an open-source implementation of the algorithm. Our experiments on five diverse languages (Bulgarian, Danish, English, Portuguese, Spanish) achieve significant improvements compared with previous methods for the same task.