Cross-Domain Knowledge Transfer Using Semi-supervised Classification

  • Authors:
  • Yi Zhen; Chunping Li

  • Affiliations:
  • Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, China; School of Software, Tsinghua University, Beijing, China 100084

  • Venue:
  • AI '08 Proceedings of the 21st Australasian Joint Conference on Artificial Intelligence: Advances in Artificial Intelligence
  • Year:
  • 2008

Abstract

Traditional text classification algorithms rest on a basic assumption: the training and test data are drawn from the same distribution. However, this identical-distribution assumption is often violated in real applications. When the test data from the target domain and the training data from an auxiliary domain follow different distributions, we call the classification problem cross-domain classification. Although most of the training data are drawn from the auxiliary domain, a small amount of labeled training data from the target domain can still be obtained. To solve the cross-domain classification problem in this setting, we propose a two-stage algorithm based on semi-supervised classification. We first use the labeled data in the target domain to filter the support vectors of the auxiliary domain, and then use the filtered data together with the labeled data from the target domain to construct a classifier for the target domain. Experimental evaluation on real-world text classification problems demonstrates encouraging results and validates our approach.
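
The two-stage idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the filtering rule, the SVM variant, or how unlabeled target data is used, so everything below is an assumption made for illustration: a small linear SVM trained by subgradient descent, "support vectors" taken as points on or inside the margin, and a hypothetical nearest-neighbor filter (`filter_by_target`) that keeps an auxiliary support vector only if its label agrees with its closest labeled target example. The semi-supervised component is omitted. All function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=300):
    """Linear SVM via full-batch subgradient descent on the hinge loss.

    Labels must be +1 or -1. This is a stand-in for whatever SVM solver
    the paper actually uses (an assumption).
    """
    n, dim = len(X), len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        gw, gb = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1.0:  # margin violated
                for j in range(dim):
                    gw[j] -= yi * xi[j] / n
                gb -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b


def support_vector_indices(X, y, w, b, tol=1e-6):
    """Indices of points on or inside the margin (treated as support vectors)."""
    return [i for i, (xi, yi) in enumerate(zip(X, y))
            if yi * (dot(w, xi) + b) <= 1.0 + tol]


def filter_by_target(X_sv, y_sv, X_tgt, y_tgt):
    """Hypothetical filter: keep a support vector only if its label matches
    the label of its nearest labeled target example (squared Euclidean)."""
    kept_X, kept_y = [], []
    for xs, ys in zip(X_sv, y_sv):
        nearest = min(
            range(len(X_tgt)),
            key=lambda k: sum((a - c) ** 2 for a, c in zip(xs, X_tgt[k])),
        )
        if y_tgt[nearest] == ys:
            kept_X.append(xs)
            kept_y.append(ys)
    return kept_X, kept_y


def two_stage(X_aux, y_aux, X_tgt, y_tgt):
    # Stage 1: train on the auxiliary domain, then filter its support vectors
    # using the few labeled target examples.
    w, b = train_linear_svm(X_aux, y_aux)
    idx = support_vector_indices(X_aux, y_aux, w, b)
    X_f, y_f = filter_by_target(
        [X_aux[i] for i in idx], [y_aux[i] for i in idx], X_tgt, y_tgt
    )
    # Stage 2: train the target classifier on filtered + labeled target data.
    return train_linear_svm(X_f + X_tgt, y_f + y_tgt)


def predict(model, x):
    w, b = model
    return 1 if dot(w, x) + b >= 0 else -1
```

In this sketch, an auxiliary example whose label conflicts with nearby target evidence is dropped before the second training pass, which is one plausible reading of "filtering the support vectors"; the paper's actual criterion may differ.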