Similarity-Based Feature Selection for Learning from Examples with Continuous Values

  • Authors:
  • Yun Li;Su-Jun Hu;Wen-Jie Yang;Guo-Zi Sun;Fang-Wu Yao;Geng Yang

  • Affiliations:
  • College of Computer, Nanjing University of Posts and Telecommunications, Nanjing, P.R. China 210003 (all authors)

  • Venue:
  • PAKDD '09 Proceedings of the 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
  • Year:
  • 2009

Abstract

In many real-world problems in areas such as machine learning and data mining, feature selection is used to choose a small subset of features that is sufficient to predict the target labels well. In this paper, we propose a feature selection algorithm based on similarity and the extension matrix. Extension matrix theory is important in learning from examples and was originally designed to deal with discrete feature values; in this paper it is extended to cope with continuous values and used as the search strategy. The evaluation criterion for feature selection is based on the similarity between classes, which is obtained from the similarity between examples in different classes using the min-max learning rule. The algorithm is justified in theory and is shown to outperform two other classic general-purpose algorithms on several real-world benchmark data sets and on facial image sets with different poses and expressions for gender classification.
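
The abstract does not give formulas, so the following is only an illustrative sketch of how a class-similarity criterion of this kind might be computed, not the authors' algorithm. It assumes features scaled to [0, 1], a hypothetical per-feature similarity of 1 - |a - b|, example similarity taken as the minimum over the selected features, and class similarity taken as the maximum over cross-class example pairs. A plain greedy forward search stands in for the extension-matrix search strategy, which is not reproduced here. The names `class_similarity`, `greedy_select`, and `threshold` are made up for this example.

```python
import numpy as np

def class_similarity(X, y, features):
    """Between-class similarity on a feature subset (assumed min-max form).

    Example similarity = min over selected features of 1 - |a - b|.
    Class similarity   = max of that value over all cross-class pairs.
    """
    Xs = X[:, features]
    labels = np.unique(y)
    worst = 0.0
    for i, a in enumerate(labels):
        for b in labels[i + 1:]:
            A, B = Xs[y == a], Xs[y == b]
            # Per-feature similarity for every cross-class pair: (|A|, |B|, k)
            sim = 1.0 - np.abs(A[:, None, :] - B[None, :, :])
            pair_sim = sim.min(axis=2)          # min over features
            worst = max(worst, float(pair_sim.max()))  # max over pairs
    return worst

def greedy_select(X, y, threshold=0.5):
    """Greedy forward search (a stand-in, NOT the extension-matrix search).

    Adds the feature that lowers between-class similarity the most until
    the similarity is at or below `threshold` or no feature helps.
    """
    remaining = list(range(X.shape[1]))
    selected, current = [], 1.0
    while remaining and current > threshold:
        scores = [class_similarity(X, y, selected + [f]) for f in remaining]
        best = int(np.argmin(scores))
        if scores[best] >= current:
            break  # no remaining feature separates the classes further
        current = scores[best]
        selected.append(remaining.pop(best))
    return selected, current

# Toy data (invented for illustration): feature 0 separates the two classes.
X_demo = np.array([[0.0, 0.5, 0.1],
                   [0.1, 0.4, 0.9],
                   [0.9, 0.5, 0.2],
                   [1.0, 0.6, 0.8]])
y_demo = np.array([0, 0, 1, 1])
selected, score = greedy_select(X_demo, y_demo)
print(selected, score)
```

Under these assumptions, adding a feature can never increase the between-class similarity, which is why the search can stop as soon as the threshold is reached.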