Semi-supervised learning for natural language processing

Authors:
John Blitzer;Xiaojin Jerry Zhu
Affiliations:
Microsoft Research Asia, Beijing, China;University of Wisconsin, Madison, WI
Venue:
HLT-Tutorials '08 Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Tutorial Abstracts
Year:
2008

Abstract

The amount of unlabeled linguistic data available to us is much larger, and growing much faster, than the amount of labeled data. Semi-supervised learning algorithms combine unlabeled data with a small labeled training set to train better models. This tutorial emphasizes practical applications of semi-supervised learning; we treat semi-supervised learning methods as tools for building effective models from limited training data. An attendee will leave our tutorial with:

1. A basic knowledge of the most common classes of semi-supervised learning algorithms and where they have been used in NLP before.
2. The ability to decide which class will be useful in her research.
3. Suggestions for avoiding potential pitfalls in semi-supervised learning.