The ultimate goal of supervised feature selection is to select a feature subset that maximizes prediction accuracy. Most existing methods, such as filter and embedded models, can be viewed as optimizing approximate objectives that differ from the prediction accuracy. Wrapper models maximize the prediction accuracy directly, but the optimization has very high computational complexity. To address these limitations, we present an ordinal optimization perspective for feature selection (OOFS). Feature subset evaluation is formulated as the simulation of a stochastic system, and supervised feature selection becomes maximization of the system's expected performance, so ordinal optimization can be applied to identify a set of order-good-enough solutions with much lower complexity and with parallel computing. These solutions correspond to the truly good enough (value-good-enough) solutions when the structure of the solution space, characterized by the ordered performance curve (OPC), is concave. We show that this holds in some important applications, such as image classification, where a large number of features have similar discriminative ability. We further improve the OOFS method with a feature scoring algorithm, called OOFSs. We prove that, when the performance difference between solutions increases monotonically with the solution difference, the expectation of the scores provides useful information for estimating the globally optimal solution. Experimental results on sixteen real-world datasets show that our method offers a good trade-off between prediction accuracy and computational complexity.
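The core ordinal optimization idea described above can be illustrated with a minimal sketch: sample candidate feature subsets, rank them with a cheap, noisy (crude) evaluation, and keep the top-s subsets as the order-good-enough set. This is not the authors' OOFS algorithm; the function names, the toy relevance scores, and the noisy evaluator are all hypothetical stand-ins for a fast approximate accuracy estimate.

```python
import random

def ordinal_feature_selection(n_features, subset_size, n_candidates, s,
                              crude_eval, seed=0):
    """Ordinal-optimization-style search (illustrative sketch):
    sample candidate subsets uniformly, rank them by a cheap noisy
    evaluation, and return the top-s 'order-good-enough' subsets.
    Ranking, not exact value estimation, drives the selection."""
    rng = random.Random(seed)
    candidates = [tuple(sorted(rng.sample(range(n_features), subset_size)))
                  for _ in range(n_candidates)]
    ranked = sorted(candidates, key=crude_eval, reverse=True)
    return ranked[:s]

# Toy problem (hypothetical): feature i has "relevance" i, and the crude
# evaluator is a noisy sum of relevances, standing in for an approximate
# accuracy estimate computed on a small data subsample.
relevance = list(range(20))
noise = random.Random(1)

def crude_eval(subset):
    return sum(relevance[i] for i in subset) + noise.gauss(0, 2.0)

top = ordinal_feature_selection(n_features=20, subset_size=5,
                                n_candidates=200, s=10,
                                crude_eval=crude_eval)
```

Because only the *order* of candidates matters, the crude evaluation can be far cheaper (and noisier) than a full wrapper-style accuracy estimate, which is where the complexity reduction comes from.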