A stopping criterion for active learning

  • Authors: Andreas Vlachos
  • Affiliations: Computer Laboratory, University of Cambridge, 15 JJ Thomson Avenue, Cambridge, Cambridgeshire CB3 0FD, UK
  • Venue: Computer Speech and Language
  • Year: 2008

Abstract

Active learning (AL) is a framework that attempts to reduce the cost of annotating training material for statistical learning methods. While many papers have reported impressive annotation savings from applying AL to natural language processing tasks, little work has been done on defining a stopping criterion. In this work, we present a stopping criterion for active learning based on the way instances are selected during uncertainty-based sampling and verify its applicability in a variety of settings. The statistical learning models used in our study are support vector machines (SVMs), maximum entropy models and Bayesian logistic regression, and the tasks performed are text classification, named entity recognition and shallow parsing. In addition, we present a method for multiclass mutually exclusive SVM active learning.
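
The abstract states that the criterion is derived from the behaviour of instance selection during uncertainty-based sampling. The sketch below is a minimal, self-contained illustration of pool-based uncertainty sampling with a simple confidence-plateau stopping rule, assuming a scikit-learn-style probabilistic classifier on synthetic data; the dataset, batch size, patience value and the exact rule are illustrative assumptions, not the paper's published criterion.

```python
# Illustrative sketch only: pool-based uncertainty sampling with a simple
# confidence-plateau stopping rule. The dataset, model, batch size and the
# exact rule are assumptions for illustration, not the paper's criterion.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=5000, n_features=50, n_informative=20,
                           n_classes=3, random_state=0)

rng = np.random.default_rng(0)
labeled = list(rng.choice(len(y), size=100, replace=False))  # seed set
pool = [i for i in range(len(y)) if i not in set(labeled)]

BATCH, PATIENCE = 50, 3
best_conf, drops = 0.0, 0

for round_ in range(100):
    clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])

    # Uncertainty sampling: confidence = probability of the most likely
    # class; query the pool instances the model is least confident about.
    probs = clf.predict_proba(X[pool])
    conf = probs.max(axis=1)
    order = np.argsort(conf)                      # least confident first
    batch = [pool[i] for i in order[:BATCH]]
    batch_conf = conf[order[:BATCH]].mean()       # confidence of the selected batch

    # Stopping rule (illustrative): once the confidence of the selected
    # instances stops rising for PATIENCE consecutive rounds, assume further
    # annotation brings little benefit and stop.
    if batch_conf > best_conf:
        best_conf, drops = batch_conf, 0
    else:
        drops += 1
        if drops >= PATIENCE:
            print(f"stopping at round {round_} with {len(labeled)} labeled instances")
            break

    labeled.extend(batch)                         # the true labels stand in for the oracle
    pool = [i for i in pool if i not in set(batch)]
```

In a real annotation setting the oracle would be a human supplying labels for each queried batch; here the known labels of the synthetic data play that role so the loop can be run end to end.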