Forward semi-supervised feature selection

  • Authors:
  • Jiangtao Ren; Zhengyuan Qiu; Wei Fan; Hong Cheng; Philip S. Yu

  • Affiliations:
  • Department of Computer Science, Sun Yat-Sen University, Guangzhou, China; Department of Computer Science, Sun Yat-Sen University, Guangzhou, China; IBM T. J. Watson Research Center; Computer Science Department, UIUC; Computer Science, University of Illinois at Chicago

  • Venue:
  • PAKDD'08: Proceedings of the 12th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
  • Year:
  • 2008

Abstract

Traditionally, feature selection methods work directly on labeled examples. However, the availability of labeled examples cannot be taken for granted in many real-world applications, such as medical diagnosis, forensic science, and fraud detection, where labeled examples are hard to find. This practical problem calls for "semi-supervised feature selection": choosing, from both labeled and unlabeled examples, the set of features that yields the most accurate classifier for a learning algorithm. In this paper, we introduce a "wrapper-type" forward semi-supervised feature selection framework. In essence, it uses unlabeled examples to extend the initial labeled training set. Extensive experiments on publicly available datasets show that our proposed framework generally outperforms both traditional supervised and state-of-the-art "filter-type" semi-supervised feature selection algorithms [5] by 1% to 10% in accuracy.
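
The abstract suggests a greedy forward search in which each candidate feature set is scored by a classifier trained on the labeled data extended with unlabeled examples. The sketch below is only an illustration of that idea, not the authors' exact algorithm: the base classifier (logistic regression), the confidence threshold for pseudo-labelling, the cross-validated scoring criterion, and the names `forward_semi_supervised_select` and `conf` are all assumptions made for demonstration.

```python
# Illustrative sketch of wrapper-type forward semi-supervised feature selection.
# NOT the paper's exact method: the learner, the pseudo-labelling rule, and the
# scoring criterion are assumptions chosen for a self-contained example.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def forward_semi_supervised_select(X_lab, y_lab, X_unlab, n_features, conf=0.9):
    """Greedily add features; score each candidate set on labeled data
    extended with confidently pseudo-labelled unlabeled examples."""
    selected = []
    remaining = list(range(X_lab.shape[1]))
    while remaining and len(selected) < n_features:
        best_feat, best_score = None, -np.inf
        for f in remaining:
            cols = selected + [f]
            # Fit a base classifier on the labeled data with the candidate features.
            clf = LogisticRegression(max_iter=1000).fit(X_lab[:, cols], y_lab)
            # Pseudo-label unlabeled examples whose predicted class is confident.
            proba = clf.predict_proba(X_unlab[:, cols])
            mask = proba.max(axis=1) >= conf
            if mask.any():
                X_ext = np.vstack([X_lab[:, cols], X_unlab[mask][:, cols]])
                y_ext = np.concatenate([y_lab, clf.predict(X_unlab[mask][:, cols])])
            else:
                X_ext, y_ext = X_lab[:, cols], y_lab
            # Wrapper criterion: cross-validated accuracy on the extended training set.
            score = cross_val_score(LogisticRegression(max_iter=1000),
                                    X_ext, y_ext, cv=3).mean()
            if score > best_score:
                best_feat, best_score = f, score
        selected.append(best_feat)
        remaining.remove(best_feat)
    return selected
```

Scoring each candidate set on the extended training set, rather than on the small labeled set alone, is what lets the unlabeled examples influence which feature is added at each step; the authors' actual framework may extend the training set and evaluate candidates differently.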