Triplex transfer learning: exploiting both shared and distinct concepts for text classification

  • Authors:
  • Fuzhen Zhuang;Ping Luo;Changying Du;Qing He;Zhongzhi Shi

  • Affiliations:
  • Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China (Zhuang, Du, He, Shi); Hewlett-Packard Labs China, Beijing, China (Luo)

  • Venue:
  • Proceedings of the sixth ACM international conference on Web search and data mining
  • Year:
  • 2013


Abstract

Transfer learning addresses learning scenarios in which the test data from target domains and the training data from source domains are drawn from similar but different distributions over the raw features. Some recent studies have argued that high-level concepts (e.g., word clusters) can help model the difference in data distributions and are thus more appropriate for classification. Specifically, these methods assume that all data domains share the same set of concepts, which serve as the bridge for knowledge transfer. However, besides these shared concepts, each domain may have its own distinct concepts. To address this point, we propose a general transfer learning framework based on non-negative matrix tri-factorization that explores both shared and distinct concepts across all domains simultaneously. Since this model provides more flexibility in fitting the data, it may lead to better classification accuracy. To solve the proposed optimization problem, we develop an iterative algorithm and theoretically analyze its convergence. Finally, extensive experiments show the significant superiority of our model over the baseline methods. In particular, our method works much better on the more challenging tasks in which distinct concepts may exist.
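The abstract does not spell out the factorization, so as orientation only, here is a minimal sketch of plain non-negative matrix tri-factorization with standard multiplicative updates, the building block the framework is based on. In text classification, X would be a word-by-document matrix, F a word-to-concept map, and G a document-to-class map; the function name, shapes, and update schedule are illustrative assumptions, not the paper's algorithm (which additionally splits concepts into shared and domain-specific blocks across multiple domains).

```python
import numpy as np

def tri_factorize(X, k1, k2, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix X (m x n) as F @ S @ G.T,
    with F (m x k1), S (k1 x k2), G (n x k2) all non-negative.

    Illustrative sketch only: uses the standard multiplicative
    updates that monotonically decrease ||X - F S G^T||_F^2,
    not the paper's multi-domain triplex algorithm."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    F = rng.random((m, k1))
    S = rng.random((k1, k2))
    G = rng.random((n, k2))
    for _ in range(n_iter):
        # Each factor is updated by the ratio of the positive and
        # negative parts of the gradient; eps avoids division by zero.
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
    return F, S, G
```

Because the updates are multiplicative, non-negativity of all three factors is preserved automatically, which is what makes the middle factor S interpretable as an association between word concepts and document classes.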