Painless unsupervised learning with features

Authors: Taylor Berg-Kirkpatrick; Alexandre Bouchard-Côté; John DeNero; Dan Klein
Affiliations: University of California at Berkeley (all authors)
Venue: HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Year: 2010

Abstract

We show how features can easily be added to standard generative models for unsupervised learning without requiring complex new training methods. In particular, each component multinomial of a generative model can be turned into a miniature logistic regression model, provided feature locality permits. The intuitive EM algorithm still applies, but with a gradient-based M-step familiar from discriminative training of logistic regression models. We apply this technique to part-of-speech induction, grammar induction, word alignment, and word segmentation, incorporating a few linguistically motivated features into the standard generative model for each task. Each of these feature-enhanced models outperforms its basic counterpart by a substantial margin, and the models even compete with and surpass more complex state-of-the-art models.
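The idea of replacing a component multinomial with a small logistic regression model, and fitting it in a gradient-based M-step, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the feature tensor `F` (context, outcome, feature), the expected-count matrix `E` that an E-step would supply, the L2 penalty weight `kappa`, and the function names. Plain gradient ascent stands in for whatever optimizer a real implementation would use.

```python
import numpy as np

def component_probs(w, F):
    """theta[c, d] = exp(w . F[c, d]) / sum over d' of exp(w . F[c, d'])."""
    scores = F @ w                                   # shape (C, D)
    scores -= scores.max(axis=1, keepdims=True)      # stabilize the softmax
    expd = np.exp(scores)
    return expd / expd.sum(axis=1, keepdims=True)

def m_step_objective(w, F, E, kappa):
    """Expected complete-data log-likelihood with an L2 penalty, plus its gradient.

    E[c, d] holds the expected count of outcome d in context c (hypothetical
    E-step output); kappa is an assumed regularization weight.
    """
    theta = component_probs(w, F)
    objective = np.sum(E * np.log(theta)) - kappa * (w @ w)
    # Gradient = feature counts under E minus feature counts the model expects.
    observed = np.einsum("cd,cdk->k", E, F)
    totals = E.sum(axis=1, keepdims=True)
    expected = np.einsum("cd,cdk->k", totals * theta, F)
    grad = observed - expected - 2.0 * kappa * w
    return objective, grad

def m_step(w, F, E, kappa=0.0, lr=0.05, steps=5000):
    """Gradient-ascent M-step (a simple stand-in optimizer, chosen for brevity)."""
    w = w.copy()
    for _ in range(steps):
        _, grad = m_step_objective(w, F, E, kappa)
        w += lr * grad
    return w

# Toy setup: 2 contexts, 3 outcomes, one indicator feature per (context, outcome).
C, D = 2, 3
F = np.zeros((C, D, C * D))
for c in range(C):
    for d in range(D):
        F[c, d, c * D + d] = 1.0

# Made-up expected counts, as if produced by an E-step.
E = np.array([[1.0, 3.0, 6.0],
              [4.0, 4.0, 2.0]])

w_hat = m_step(np.zeros(C * D), F, E)
theta_hat = component_probs(w_hat, F)
print(theta_hat)
```

With one indicator feature per (context, outcome) pair and no regularization, the feature-based M-step has the same optimum as the usual count-and-normalize M-step, so `theta_hat` should approach the row-normalized expected counts. Sharing features across outcomes, for example a hypothetical suffix or capitalization feature, is what would let the model generalize beyond that baseline.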