Dimensionality reduction via sparse support vector machines

Authors:
Jinbo Bi; Kristin Bennett; Mark Embrechts; Curt Breneman; Minghu Song
Affiliations:
Department of Mathematical Sciences, Rensselaer Polytechnic Institute, Troy, NY; Department of Mathematical Sciences, Rensselaer Polytechnic Institute, Troy, NY; Department of Decision Science and Engineering Systems, Rensselaer Polytechnic Institute, Troy, NY; Department of Chemistry, Rensselaer Polytechnic Institute, Troy, NY; Department of Chemistry, Rensselaer Polytechnic Institute, Troy, NY
Venue:
The Journal of Machine Learning Research
Year:
2003

Abstract

We describe a methodology for performing variable ranking and selection using support vector machines (SVMs). The method constructs a series of sparse linear SVMs to generate linear models that can generalize well, and uses a subset of nonzero weighted variables found by the linear models to produce a final nonlinear model. The method exploits the fact that a linear SVM (no kernels) with ℓ1-norm regularization inherently performs variable selection as a side effect of minimizing the capacity of the SVM model. The distribution of the linear model weights provides a mechanism for ranking and interpreting the effects of variables. Starplots are used to visualize the magnitude and variance of the weights for each variable. We illustrate the effectiveness of the methodology on synthetic data, benchmark problems, and challenging regression problems in drug design. This method can dramatically reduce the number of variables and outperforms both SVMs trained on all attributes and SVMs trained on attributes selected according to correlation coefficients. The visualization of the resulting models is useful for understanding the role of underlying variables.
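
The core idea of the abstract, that an ℓ1-regularized linear SVM drives the weights of irrelevant variables to zero and that the spread of weights over a series of models can be used for ranking, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a classification toy problem rather than the regression problems of the paper, it assumes bootstrap resampling as the way to obtain the series of models, and it swaps in plain proximal subgradient descent on the hinge loss instead of whatever solver the paper uses. The function names, the toy data, and the values of `lam`, `epochs`, `lr`, and `n_models` are all hypothetical choices made for illustration.

```python
import random
import statistics


def make_toy_data(n=150, d=6, seed=0):
    """Hypothetical toy data: the label depends only on features 0 and 1."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    y = [1 if row[0] - row[1] > 0 else -1 for row in X]
    return X, y


def soft_threshold(v, t):
    """Proximal operator of t*|v|: shrinks v toward zero, exactly zero on [-t, t]."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_sparse_linear_svm(X, y, lam=0.2, epochs=80, lr=0.5):
    """Minimize mean hinge loss + lam * ||w||_1 by proximal subgradient descent.

    An illustrative stand-in solver, not the formulation used in the paper.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for epoch in range(epochs):
        step = lr / (1 + epoch) ** 0.5  # decaying step size
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1:  # point violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        # Gradient step on the hinge loss, then soft-thresholding for the l1 term.
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
        b -= step * grad_b
    return w, b


def rank_variables(X, y, n_models=8, seed=1):
    """Fit sparse linear SVMs on bootstrap resamples and rank variables by weight."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    weights = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap resample
        w, _bias = fit_sparse_linear_svm([X[i] for i in idx], [y[i] for i in idx])
        weights.append(w)
    means = [statistics.mean([w[j] for w in weights]) for j in range(d)]
    stds = [statistics.pstdev([w[j] for w in weights]) for j in range(d)]
    ranking = sorted(range(d), key=lambda j: abs(means[j]), reverse=True)
    return means, stds, ranking


if __name__ == "__main__":
    X, y = make_toy_data()
    means, stds, ranking = rank_variables(X, y)
    for j in ranking:
        print(f"feature {j}: mean weight {means[j]:+.3f}, std {stds[j]:.3f}")
```

In this sketch the per-variable mean and standard deviation of the weights play the role of the magnitude and variance that the abstract visualizes with starplots, and the sign of the mean weight indicates the direction of a variable's effect. The variables whose weights stay nonzero would then be passed to a nonlinear model, a step that is omitted here.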