We're not in Kansas anymore: detecting domain changes in streams

  • Authors:
  • Mark Dredze; Tim Oates; Christine Piatko

  • Affiliations:
  • Human Language Technology Center of Excellence and University of Maryland, Baltimore County; Human Language Technology Center of Excellence and University of Maryland, Baltimore County; Human Language Technology Center of Excellence and Johns Hopkins University

  • Venue:
  • EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
  • Year:
  • 2010

Abstract

Domain adaptation, the problem of adapting a natural language processing system trained in one domain to perform well in a different domain, has received significant attention. This paper addresses an important problem for deployed systems that has received little attention: detecting when such adaptation is needed by a system operating in the wild, i.e., performing classification over a stream of unlabeled examples. Our method uses A-distance, a metric for detecting shifts in data streams, combined with classification margins to detect domain shifts. We empirically show effective domain shift detection on a variety of data sets and shift conditions.
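
As a rough illustration of the idea in the abstract, the sketch below compares the classification margins in a fixed reference window against those in a sliding window and flags a shift when an approximate A-distance exceeds a threshold. It is a minimal sketch and not the authors' implementation: it assumes the A-distance is taken over intervals of the real line, approximates those intervals with unions of adjacent equal-width bins, and the names `a_distance` and `detect_shift`, the window size, the bin count, and the threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def a_distance(ref, cur, n_bins=10):
    """Approximate A-distance between two 1-D samples.

    Assumption: the family of sets is intervals, approximated here by
    contiguous runs of equal-width histogram bins.
    Returns 2 * max over such intervals of |P_ref(A) - P_cur(A)|.
    """
    ref = np.asarray(ref, dtype=float)
    cur = np.asarray(cur, dtype=float)
    lo = min(ref.min(), cur.min())
    hi = max(ref.max(), cur.max())
    if lo == hi:
        return 0.0  # both samples are the same constant
    edges = np.linspace(lo, hi, n_bins + 1)
    p = np.histogram(ref, bins=edges)[0] / len(ref)
    q = np.histogram(cur, bins=edges)[0] / len(cur)
    # Prefix sums of the per-bin mass difference: the difference over
    # bins [i, j) is c[j] - c[i], so the largest absolute difference
    # over all contiguous runs is max(c) - min(c).
    c = np.concatenate(([0.0], np.cumsum(p - q)))
    return 2.0 * (c.max() - c.min())


def detect_shift(margins, window=100, threshold=1.0, n_bins=10):
    """Return the stream index at which a shift is first flagged, or None.

    The first `window` margins form the reference sample; each later
    window of the same size is compared against it.
    """
    margins = np.asarray(margins, dtype=float)
    ref = margins[:window]
    for start in range(window, len(margins) - window + 1):
        cur = margins[start:start + window]
        if a_distance(ref, cur, n_bins) > threshold:
            return start + window - 1  # last example in the flagged window
    return None
```

In use, `margins` would hold one confidence value per unlabeled example in stream order (for a linear classifier, for instance, the absolute value of the decision score). The threshold trades false alarms against detection delay and would need tuning on held-out streams.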