Domain adaptation with unlabeled data for dialog act tagging

  • Authors:
  • Anna Margolis;Karen Livescu;Mari Ostendorf

  • Affiliations:
  • University of Washington, Seattle, WA and TTI-Chicago, Chicago, IL;TTI-Chicago, Chicago, IL;University of Washington, Seattle, WA

  • Venue:
  • DANLP 2010 Proceedings of the 2010 Workshop on Domain Adaptation for Natural Language Processing
  • Year:
  • 2010

Abstract

We investigate the classification of utterances into high-level dialog act categories using word-based features, under conditions where the train and test data differ by genre and/or language. We handle the cross-language cases with machine translation of the test utterances. We analyze and compare two feature-based approaches to using unlabeled data in adaptation: restriction to a shared feature set, and an implementation of Blitzer et al.'s Structural Correspondence Learning. Both methods lead to increased detection of backchannels in the cross-language cases by utilizing correlations between backchannel words and utterance length.
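
The first approach named in the abstract, restriction to a shared feature set, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the whitespace tokenization, the `min_count` threshold, and the `__length__` feature are all assumptions made for illustration. The idea shown is only that word features are kept when they appear in both the source (training) data and the unlabeled target data.

```python
from collections import Counter


def shared_vocabulary(source_utts, target_utts, min_count=1):
    """Return the words seen at least min_count times in BOTH domains.

    Hypothetical helper: only unlabeled text from each domain is needed.
    """
    src = Counter(w for u in source_utts for w in u.lower().split())
    tgt = Counter(w for u in target_utts for w in u.lower().split())
    return {w for w in src if src[w] >= min_count and tgt[w] >= min_count}


def featurize(utterance, vocab):
    """Bag-of-words counts restricted to vocab, plus an utterance-length feature.

    The length feature is an assumption, included because the abstract
    mentions correlations between backchannel words and utterance length.
    """
    tokens = utterance.lower().split()
    feats = Counter(t for t in tokens if t in vocab)
    feats["__length__"] = len(tokens)
    return dict(feats)


# Toy data, invented for this example.
source = ["uh-huh", "yeah right", "what do you think"]
target = ["yeah", "right okay", "do you agree"]

vocab = shared_vocabulary(source, target)
features = featurize("yeah okay right", vocab)
```

Here `okay` is dropped from the features because it never occurs in the source data, so a classifier trained on the source domain could not have learned a weight for it.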