Semi-supervised learning for natural language processing

  • Authors:
  • John Blitzer; Xiaojin Jerry Zhu

  • Affiliations:
  • Microsoft Research Asia, Beijing, China; University of Wisconsin, Madison, WI

  • Venue:
  • HLT-Tutorials '08 Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Tutorial Abstracts
  • Year:
  • 2008

Abstract

The amount of unlabeled linguistic data available to us is much larger, and growing much faster, than the amount of labeled data. Semi-supervised learning algorithms combine unlabeled data with a small labeled training set to train better models. This tutorial emphasizes practical applications of semi-supervised learning; we treat semi-supervised learning methods as tools for building effective models from limited training data. An attendee will leave our tutorial with:

1. A basic knowledge of the most common classes of semi-supervised learning algorithms and where they have previously been used in NLP.
2. The ability to decide which class will be useful in her research.
3. Suggestions for avoiding potential pitfalls in semi-supervised learning.