The process of placing a separating hyperplane for data classification is normally disconnected from the process of selecting the features to use. An approach to feature selection that is conceptually simple but computationally explosive is to apply the hyperplane placement process to every possible subset of features and then select the smallest subset that provides reasonable classification accuracy. Two ways to speed this process are (i) to use a faster filtering criterion in place of a complete hyperplane placement, and (ii) to use a greedy forward or backward sequential selection method. This paper introduces a new filtering criterion that is very fast: maximizing the drop in the sum of infeasibilities in a linear-programming transformation of the problem. It also shows how the linear-programming transformation can be applied to reduce the number of features after a separating hyperplane has already been placed, while maintaining the separation originally induced by that hyperplane. Finally, it introduces a new and highly effective integrated method that selects features and places the separating hyperplane simultaneously.
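The greedy sequential strategy described above can be sketched in a few lines. The following is a minimal illustration, not the paper's method: the paper's criterion solves a linear program and measures the drop in the sum of infeasibilities, whereas this stand-in scores a feature subset by the hinge-style infeasibility of a cheap centroid-difference hyperplane. The function names (`infeasibility`, `greedy_forward_select`) and the scoring rule are hypothetical choices for the sketch.

```python
# Sketch of greedy forward feature selection driven by an
# "infeasibility drop" criterion. ASSUMPTION: the true criterion in the
# paper comes from an LP solve; here a centroid-difference hyperplane
# with hinge penalties stands in as a cheap surrogate.

def infeasibility(X, y, features):
    """Sum of hinge infeasibilities of a centroid-difference hyperplane
    restricted to the given feature subset (surrogate for the LP's
    sum of infeasibilities)."""
    if not features:
        return float(len(X))  # empty subset: no separating power
    pos = [r for r, lab in zip(X, y) if lab == +1]
    neg = [r for r, lab in zip(X, y) if lab == -1]
    # Hyperplane normal: difference of class means on selected features.
    w = {j: sum(r[j] for r in pos) / len(pos)
            - sum(r[j] for r in neg) / len(neg)
         for j in features}
    proj = lambda r: sum(w[j] * r[j] for j in features)
    # Threshold: midpoint of the projected class means.
    b = (sum(proj(r) for r in pos) / len(pos)
         + sum(proj(r) for r in neg) / len(neg)) / 2.0
    # Hinge infeasibility: how far each point falls short of margin 1.
    return sum(max(0.0, 1.0 - lab * (proj(r) - b)) for r, lab in zip(X, y))

def greedy_forward_select(X, y, max_features):
    """Repeatedly add the feature whose inclusion yields the largest
    drop in total infeasibility; stop when no feature helps."""
    selected, remaining = [], set(range(len(X[0])))
    current = infeasibility(X, y, selected)
    while remaining and len(selected) < max_features:
        best_j, best_val = None, current
        for j in remaining:
            val = infeasibility(X, y, selected + [j])
            if val < best_val:
                best_j, best_val = j, val
        if best_j is None:
            break  # no candidate reduces infeasibility further
        selected.append(best_j)
        remaining.discard(best_j)
        current = best_val
    return selected
```

On a toy data set where feature 0 separates the classes and feature 1 is noise, the greedy loop selects feature 0 first and then stops, since adding the noise feature produces no further drop. A backward variant would start from all features and repeatedly delete the feature whose removal causes the smallest rise in infeasibility.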