Fast, greedy model minimization for unsupervised tagging

Authors:
Sujith Ravi;Ashish Vaswani;Kevin Knight;David Chiang
Affiliations:
University of Southern California;University of Southern California;University of Southern California;University of Southern California
Venue:
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Year:
2010

Citing 9
Cited 2

Computers and Intractability: A Guide to the Theory of NP-Completeness

Computers and Intractability: A Guide to the Theory of NP-Completeness
Tagging English text with a probabilistic model

Computational Linguistics
Contrastive estimation: training log-linear models on unlabeled data

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Semantic role labeling via integer linear programming inference

COLING '04 Proceedings of the 20th International Conference on Computational Linguistics
A new objective function for word alignment

ILP '09 Proceedings of the Workshop on Integer Linear Programming for Natural Language Processing
Attacking decipherment problems optimally with low-order N-gram models

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Global inference for sentence compression: an integer linear programming approach

Journal of Artificial Intelligence Research
Concise integer linear programming formulations for dependency parsing

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1
Minimized models for unsupervised part-of-speech tagging

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1

Unsupervised parse selection for HPSG

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Type-supervised hidden Markov models for part-of-speech tagging with incomplete tag dictionaries

EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

Abstract

Model minimization has been shown to work well for the task of unsupervised part-of-speech tagging with a dictionary. Ravi and Knight (2009) invoke an integer programming (IP) solver to perform model minimization. However, solving this problem exactly using an integer programming formulation is intractable for practical purposes. We propose a novel two-stage greedy approximation scheme to replace the IP. Our method runs fast while yielding highly accurate tagging results. We also compare our method against standard EM training and show that we consistently obtain better tagging accuracies on test data of varying sizes for English and Italian.