Efficient graph-based semi-supervised learning of structured tagging models

Authors:
Amarnag Subramanya;Slav Petrov;Fernando Pereira
Affiliations:
Google Research, Mountain View, CA;Google Research, New York, NY;Google Research, Mountain View, CA
Venue:
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Year:
2010

Citing 14
Cited 13

Combining labeled and unlabeled data with co-training

COLT' 98 Proceedings of the eleventh annual conference on Computational learning theory
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Transductive Inference for Text Classification using Support Vector Machines

ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
Unsupervised word sense disambiguation rivaling supervised methods

ACL '95 Proceedings of the 33rd annual meeting on Association for Computational Linguistics
Feature-rich part-of-speech tagging with a cyclic dependency network

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Beyond the point cloud: from transductive to semi-supervised learning

ICML '05 Proceedings of the 22nd international conference on Machine learning
Semi-supervised learning for structured output variables

ICML '06 Proceedings of the 23rd international conference on Machine learning
QuestionBank: creating a corpus of parse-annotated questions

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Large Scale Transductive SVMs

The Journal of Machine Learning Research
Domain adaptation with structural correspondence learning

EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Graph-based learning for statistical machine translation

NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Distributional representations for handling sparsity in supervised sequence-labeling

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Semi-Supervised Learning

Semi-Supervised Learning
On information regularization

UAI'03 Proceedings of the Nineteenth conference on Uncertainty in Artificial Intelligence

Part-of-speech tagging from 97% to 100%: is it time for some linguistics?

CICLing'11 Proceedings of the 12th international conference on Computational linguistics and intelligent text processing - Volume Part I
Unsupervised part-of-speech tagging with bilingual graph-based projections

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Semi-supervised frame-semantic parsing for unknown predicates

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Semi-supervised multi-task learning of structured prediction models for web information extraction

Proceedings of the 20th ACM international conference on Information and knowledge management
Unsupervised semantic role induction with graph partitioning

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Graph-based lexicon expansion with sparsity-inducing penalties

NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Graph-based semi-supervised learning algorithms for NLP

ACL '12 Tutorial Abstracts of ACL 2012
Bootstrapping via graph propagation

ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
A cost sensitive part-of-speech tagging: differentiating serious errors from minor errors

ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Improved parsing and POS tagging using inter-sentence consistency constraints

EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Semi-supervised multiple instance learning based domain adaptation for object detection

Proceedings of the Eighth Indian Conference on Computer Vision, Graphics and Image Processing
Wikipedia entity expansion and attribute extraction from the web using semi-supervised learning

Proceedings of the sixth ACM international conference on Web search and data mining
Full Length Article: Simulated annealing based classifier ensemble techniques: Application to part of speech tagging

Information Fusion

Quantified Score

Hi-index	0.00

Visualization

Abstract

We describe a new scalable algorithm for semi-supervised training of conditional random fields (CRF) and its application to part-of-speech (POS) tagging. The algorithm uses a similarity graph to encourage similar n-grams to have similar POS tags. We demonstrate the efficacy of our approach on a domain adaptation task, where we assume that we have access to large amounts of unlabeled data from the target domain, but no additional labeled data. The similarity graph is used during training to smooth the state posteriors on the target domain. Standard inference can be used at test time. Our approach is able to scale to very large problems and yields significantly improved target domain accuracy.