Discriminative semi-supervised feature selection via manifold regularization

  • Authors:
  • Zenglin Xu;Irwin King;Michael Rung-Tsang Lyu;Rong Jin

  • Affiliations:
  • Cluster of Excellence, Saarland University, Max Planck Institute for Informatics, Saarbruecken, Germany;Department of Computer Science and Engineering, Chinese University of Hong Kong, Shatin, Hong Kong;Department of Computer Science and Engineering, Chinese University of Hong Kong, Shatin, Hong Kong;Department of Computer Science and Engineering, Michigan State University, East Lansing, MI

  • Venue:
  • IEEE Transactions on Neural Networks
  • Year:
  • 2010

Abstract

Feature selection has attracted considerable interest in both the research and application communities of data mining. We consider the problem of semi-supervised feature selection, where we are given a small number of labeled examples and a large number of unlabeled examples. Since a small number of labeled samples is usually insufficient for identifying the relevant features, the critical problem in semi-supervised feature selection is how to take advantage of the information in the unlabeled data. To address this problem, we propose a novel discriminative semi-supervised feature selection method based on the idea of manifold regularization. The proposed approach selects features by maximizing the classification margin between different classes while simultaneously exploiting the geometry of the probability distribution that generates both labeled and unlabeled data. In comparison with previous semi-supervised feature selection algorithms, our proposed method is an embedded feature selection method and is able to find more discriminative features. We formulate the proposed feature selection method as a convex-concave optimization problem, where the saddle point corresponds to the optimal solution. To find the optimal solution, the level method, a fairly recent optimization method, is employed. We also present a theoretical proof of the convergence rate for the application of the level method to our problem. Empirical evaluation on several benchmark data sets demonstrates the effectiveness of the proposed semi-supervised feature selection method.
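
The abstract does not state the objective function, so the following is only a generic sketch of the kind of manifold regularization term such a method could build on. It is not the authors' formulation. It builds a k-nearest-neighbor graph over all points (labeled and unlabeled alike) and measures how smoothly a linear function of the selected features varies along that graph. The function names, the feature-weight vector `d`, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def knn_graph_laplacian(X, k=2):
    """Unnormalized Laplacian L = D - W of a symmetrized k-nearest-neighbor graph."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    W = np.maximum(W, W.T)  # make the graph undirected
    return np.diag(W.sum(axis=1)) - W

def manifold_penalty(X, w, d, L):
    """Smoothness term f^T L f for f = X (d * w).

    d is a hypothetical per-feature selection weight in [0, 1];
    w is a linear classifier weight vector.
    """
    f = X @ (d * w)
    return float(f @ L @ f)

# Toy data (assumed): feature 0 separates two clusters, feature 1 varies within them.
X = np.array([[0, 0], [0, 1], [0, 2],
              [10, 0], [10, 1], [10, 2]], dtype=float)
L = knn_graph_laplacian(X, k=2)
w = np.ones(2)

penalty_feature0 = manifold_penalty(X, w, np.array([1.0, 0.0]), L)  # 0.0
penalty_feature1 = manifold_penalty(X, w, np.array([0.0, 1.0]), L)  # 12.0
```

On this toy data, keeping only feature 0 gives a function that is constant within each cluster, so the penalty is zero. Keeping only feature 1 gives a function that changes between neighboring points, so it is penalized. A full method would combine such a term with a margin-based loss on the labeled examples.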