Supervised self-taught learning: actively transferring knowledge from unlabeled data

  • Authors:
  • Kaizhu Huang, Zenglin Xu, Irwin King, Michael R. Lyu, Colin Campbell

  • Affiliations:
  • Kaizhu Huang and Colin Campbell: Department of Engineering Mathematics, University of Bristol, Bristol, United Kingdom. Zenglin Xu, Irwin King, and Michael R. Lyu: Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong

  • Venue:
  • IJCNN'09 Proceedings of the 2009 International Joint Conference on Neural Networks
  • Year:
  • 2009

Abstract

We consider the task of Self-taught Learning (STL) from unlabeled data. In contrast to semi-supervised learning, which requires the unlabeled data to share the same set of class labels as the labeled data, STL can transfer knowledge from unlabeled data of different types. STL uses a three-step strategy: (1) learning high-level representations from the unlabeled data only, (2) reconstructing the labeled data via these representations, and (3) building a classifier over the reconstructed labeled data. However, the high-level representations, which are determined exclusively by the unlabeled data, may be inappropriate or even misleading for the subsequent classifier training. In this paper, we propose a novel Supervised Self-taught Learning (SSTL) framework that integrates the three isolated steps of STL into a single optimization problem. Benefiting from the interaction between the classifier optimization and the selection of high-level representations, the proposed model is able to choose those discriminative representations that are more appropriate for classification. One important feature of our framework is that the final optimization can be solved iteratively with guaranteed convergence. We evaluate the framework on various data sets. The experimental results show that the proposed SSTL can outperform STL and traditional supervised learning methods in certain instances.
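The three-step STL pipeline described in the abstract can be sketched in code. This is only an illustration of the pipeline's shape, not the paper's method: the paper's sparse-coding step is replaced here by a simple SVD/PCA basis, the classifier is a nearest-centroid rule, and all data, dimensions, and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: learn high-level representations from unlabeled data only.
# (The paper uses representations learned by sparse coding; an SVD/PCA
# basis stands in here purely for illustration.)
X_unlab = rng.normal(size=(200, 20))   # hypothetical unlabeled data
k = 5                                  # number of basis vectors (assumed)
mu = X_unlab.mean(axis=0)
_, _, Vt = np.linalg.svd(X_unlab - mu, full_matrices=False)
basis = Vt[:k]                         # (k, 20) learned basis

# Step 2: re-construct (encode) the labeled data via these representations.
X_lab = rng.normal(size=(40, 20))      # hypothetical labeled data
y_lab = (X_lab[:, 0] > 0).astype(int)  # hypothetical binary labels
codes = (X_lab - mu) @ basis.T         # (40, k) re-coded features

# Step 3: build a classifier over the re-coded labeled data.
# (A nearest-centroid rule stands in for the paper's classifier.)
centroids = np.stack([codes[y_lab == c].mean(axis=0) for c in (0, 1)])

def predict(x):
    c = (x - mu) @ basis.T
    return int(np.argmin(((centroids - c) ** 2).sum(axis=1)))

train_acc = np.mean([predict(x) == y for x, y in zip(X_lab, y_lab)])
```

The key point of the abstract is that in plain STL the basis in step 1 never sees the labels, so nothing guarantees the resulting codes are discriminative; SSTL instead couples the choice of representations with the classifier objective in one optimization.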