Active dual supervision: reducing the cost of annotating examples and features

  • Authors:
  • Prem Melville;Vikas Sindhwani

  • Affiliations:
  • IBM T.J. Watson Research Center, Yorktown Heights, NY;IBM T.J. Watson Research Center, Yorktown Heights, NY

  • Venue:
  • HLT '09 Proceedings of the NAACL HLT 2009 Workshop on Active Learning for Natural Language Processing
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

When faced with the task of building machine learning or NLP models, it is often worthwhile to turn to active learning to obtain human annotations at minimal costs. Traditional active learning schemes query a human for labels of intelligently chosen examples. However, human effort can also be expended in collecting alternative forms of annotations. For example, one may attempt to learn a text classifier by labeling class-indicating words, instead of, or in addition to, documents. Learning from two different kinds of supervision brings a new, unexplored dimension to the problem of active learning. In this paper, we demonstrate the value of such active dual supervision in the context of sentiment analysis. We show how interleaving queries for both documents and words significantly reduces human effort -- more than what is possible through traditional one-dimensional active learning, or by passive combinations of supervisory inputs.