Iterative Reinforcement Cross-Domain Text Classification

  • Authors:
  • Di Zhang;Gui-Rong Xue;Yong Yu

  • Affiliations:
  • Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China 200240 (all authors)

  • Venue:
  • ADMA '08 Proceedings of the 4th international conference on Advanced Data Mining and Applications
  • Year:
  • 2008

Abstract

Traditional text classification techniques rest on the assumption that training and test data are drawn from the same underlying distribution. In many real-world applications, however, this assumption is often not satisfied. Labeled training data are expensive to obtain, but labeled data may be available in a domain that is different from, but related to, that of the test data. How to use labeled data from a different domain to supervise classification therefore becomes a crucial task. In this paper, we propose a novel algorithm for cross-domain text classification using reinforcement learning. In our algorithm, the training process is iteratively reinforced by exploiting the relations between documents and words. Empirically, our method is an effective and scalable approach to text categorization when the training and test data come from different but related domains. The experimental results show that our algorithm achieves better performance than several state-of-the-art classifiers.