Unsupervised feature construction for improving data representation and semantics
Journal of Intelligent Information Systems
Feature selection is a critical preprocessing step in machine learning. It reduces the cost of model building and can improve predictive performance. In general, a feature selection algorithm requires a dependency measure and a search strategy. Extant dependency measures are mostly based on pairwise correlation analysis, which cannot detect feature interaction. To overcome this problem, we developed a unified dependency criterion called inference correlation. The inference correlation between a set of predictor variables and a response variable can be calculated efficiently, and the variables may be discrete, continuous, or mixed. Inference correlation can therefore be applied to select features for both classification and regression problems. We present a feature selection algorithm that uses sequential floating forward search based on inference correlation. Experiments with the algorithm on synthetic datasets and real-world problems confirm the effectiveness of the approach when compared with extant feature selection methods.
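The search strategy named in the abstract can be illustrated with a minimal sketch of sequential floating forward search. The abstract does not define inference correlation, so the sketch substitutes a stand-in set-based score, the correlation ratio (the share of variance in the response explained by grouping on the joint values of the selected discrete features). The function names `correlation_ratio` and `sffs`, the toy XOR data, and the target size `k` are all assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict

def correlation_ratio(rows, y, subset):
    """Stand-in dependency score (NOT inference correlation): fraction of the
    variance of y explained by grouping rows on the joint values of `subset`.
    Assumes discrete predictors and a numeric response."""
    if not subset:
        return 0.0
    mean_y = sum(y) / len(y)
    total = sum((v - mean_y) ** 2 for v in y)
    if total == 0:
        return 0.0
    groups = defaultdict(list)
    for row, v in zip(rows, y):
        groups[tuple(row[f] for f in subset)].append(v)
    between = sum(len(g) * (sum(g) / len(g) - mean_y) ** 2 for g in groups.values())
    return between / total

def sffs(n_features, score, k):
    """Sequential floating forward search for a subset of size k.
    `score` maps a list of feature indices to a number (higher is better)."""
    selected = []
    best = {}  # subset size -> (best score seen at that size, subset)
    while len(selected) < k:
        # Forward step: add the feature that maximizes the score of the grown set.
        candidates = [f for f in range(n_features) if f not in selected]
        f_add = max(candidates, key=lambda f: score(selected + [f]))
        selected = selected + [f_add]
        s = score(selected)
        if len(selected) not in best or s > best[len(selected)][0]:
            best[len(selected)] = (s, list(selected))
        # Floating step: drop features while that beats the best set of the smaller size.
        while len(selected) > 2:
            f_rm = max(selected, key=lambda f: score([g for g in selected if g != f]))
            reduced = [g for g in selected if g != f_rm]
            s_red = score(reduced)
            if s_red > best[len(reduced)][0]:
                selected = reduced
                best[len(reduced)] = (s_red, list(reduced))
            else:
                break
    return best[k][1]

# Toy data: y = x0 XOR x1, with x2 irrelevant. Each feature alone scores 0,
# so a pairwise measure sees nothing, but the pair {x0, x1} determines y.
rows = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
y = [a ^ b for a, b, _ in rows]
score = lambda subset: correlation_ratio(rows, y, subset)

chosen = sffs(3, score, 2)
print(chosen)  # [0, 1]
```

Because the score is evaluated on a set of predictors rather than on one predictor at a time, the interacting pair is recovered even though each of its members looks uninformative in isolation. Any other set-based criterion could be passed as `score` without changing the search.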