The ultimate goal of supervised feature selection is to select a feature subset that maximizes prediction accuracy. Most existing methods, such as filter and embedded models, can be viewed as optimizing approximate objectives that differ from the prediction accuracy. Wrapper models maximize the prediction accuracy directly, but the optimization has very high computational complexity. To address these limitations, we present an ordinal optimization perspective for feature selection (OOFS). Feature subset evaluation is formulated as the simulation of a stochastic system, and supervised feature selection becomes maximization of the system's expected performance, so ordinal optimization can be applied to identify a set of order-good-enough solutions with much lower complexity and with parallel computing. These solutions correspond to the truly good enough (value-good-enough) solutions when the structure of the solution space, characterized by the ordered performance curve (OPC), is concave. We show that this holds in some important applications, such as image classification, where a large number of features have similar discriminative ability. We further improve the OOFS method with a feature scoring algorithm, called OOFSs. We prove that, when the performance difference between solutions increases monotonically with the solution difference, the expectation of the scores provides useful information for estimating the globally optimal solution. Experimental results on sixteen real-world datasets show that our method offers a good trade-off between prediction accuracy and computational complexity.
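The core ordinal optimization idea described above can be illustrated with a minimal sketch: sample candidate feature subsets, rank them with a cheap, noisy (crude) evaluation, and keep the top-s subsets as the order-good-enough set. This is not the authors' OOFS algorithm; the function names, the toy relevance scores, and the noisy evaluator are all hypothetical stand-ins for a fast approximate accuracy estimate.

```python
import random

def ordinal_feature_selection(n_features, subset_size, n_candidates, s,
                              crude_eval, seed=0):
    """Ordinal-optimization-style search (illustrative sketch):
    sample candidate subsets uniformly, rank them by a cheap noisy
    evaluation, and return the top-s 'order-good-enough' subsets.
    Ranking, not exact value estimation, drives the selection."""
    rng = random.Random(seed)
    candidates = [tuple(sorted(rng.sample(range(n_features), subset_size)))
                  for _ in range(n_candidates)]
    ranked = sorted(candidates, key=crude_eval, reverse=True)
    return ranked[:s]

# Toy problem (hypothetical): feature i has "relevance" i, and the crude
# evaluator is a noisy sum of relevances, standing in for an approximate
# accuracy estimate computed on a small data subsample.
relevance = list(range(20))
noise = random.Random(1)

def crude_eval(subset):
    return sum(relevance[i] for i in subset) + noise.gauss(0, 2.0)

top = ordinal_feature_selection(n_features=20, subset_size=5,
                                n_candidates=200, s=10,
                                crude_eval=crude_eval)
```

Because only the *order* of candidates matters, the crude evaluation can be far cheaper (and noisier) than a full wrapper-style accuracy estimate, which is where the complexity reduction comes from.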