Feature selection has received considerable attention in various areas as a way to select informative features and to simplify the statistical model through dimensionality reduction. One of the most widely used methods for dimensionality reduction is principal component analysis (PCA). Despite its popularity, PCA suffers from a lack of interpretability with respect to the original features because the reduced dimensions are linear combinations of a large number of original features. Traditionally, two- or three-dimensional loading plots are used to identify the important original features in the first few principal component dimensions. However, the interpretation of a loading plot is frequently subjective, particularly when large numbers of features are involved. In this study, we propose an unsupervised feature selection method that combines weighted principal components (PCs) with a thresholding algorithm. The weighted PC is obtained as the weighted sum of the first k PCs of interest. Each loading value in the weighted PC reflects the contribution of an individual feature. We also propose a thresholding algorithm that identifies the significant features. Our experimental results with both simulated and real datasets demonstrate the effectiveness of the proposed unsupervised feature selection method.
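The idea of a weighted PC followed by thresholding can be illustrated with a minimal sketch. The abstract does not specify the weights or the thresholding rule, so everything specific here is an assumption: the weights are taken as each PC's share of the variance explained by the first k PCs, absolute loadings are combined, and the cutoff (mean plus z standard deviations of the scores) is a hypothetical stand-in for the proposed thresholding algorithm. The function names `weighted_pc_loadings` and `select_features` are illustrative only.

```python
import numpy as np


def weighted_pc_loadings(X, k):
    """Return one score per feature from the first k principal components.

    Assumption (not stated in the abstract): each PC is weighted by its
    share of the variance explained by the first k PCs, and absolute
    loadings are combined so that signs do not cancel.
    """
    Xc = X - X.mean(axis=0)                      # center each feature
    # SVD of the centered data: rows of Vt are the PC loading vectors
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = s[:k] ** 2                             # proportional to PC variances
    weights = var / var.sum()                    # weights sum to 1
    return weights @ np.abs(Vt[:k])              # shape: (n_features,)


def select_features(scores, z=1.0):
    """Hypothetical threshold: keep features scoring above mean + z * std.

    This simple rule is a placeholder, not the paper's thresholding algorithm.
    """
    cutoff = scores.mean() + z * scores.std()
    return np.flatnonzero(scores > cutoff)


# Simulated data: 20 features, only the first 3 share a strong latent signal
rng = np.random.default_rng(0)
n, p = 200, 20
latent = rng.normal(size=(n, 1))
X = rng.normal(scale=0.1, size=(n, p))
X[:, :3] += 3.0 * latent

scores = weighted_pc_loadings(X, k=2)
selected = select_features(scores)
print(selected)
```

In this toy setting the three signal-carrying features dominate the first PC, so their weighted loadings stand well above those of the noise features. On real data, both the choice of k and the cutoff would need more care than this sketch suggests.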