This paper considers feature selection for data classification in the presence of a huge number of irrelevant features. We propose a new feature-selection algorithm that addresses several major issues with prior work, including problems with algorithm implementation, computational complexity, and solution accuracy. The key idea is to decompose an arbitrarily complex nonlinear problem into a set of locally linear ones through local learning, and then to learn feature relevance globally within the large-margin framework. The proposed algorithm is based on well-established machine-learning and numerical-analysis techniques, and makes no assumptions about the underlying data distribution. It can process many thousands of features within minutes on a personal computer while maintaining a very high accuracy that is nearly insensitive to a growing number of irrelevant features. A theoretical analysis of the algorithm's sample complexity suggests that it scales logarithmically with the number of features. Experiments on 11 synthetic and real-world data sets demonstrate the viability of our formulation of the feature-selection problem for supervised learning and the effectiveness of our algorithm.
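To make the key idea concrete, the following is a minimal sketch of the general local-learning, large-margin feature-weighting approach: for each sample, a locally linear (RELIEF-style) hypothesis margin is formed from its nearest same-class neighbor ("hit") and nearest different-class neighbor ("miss") under a weighted distance, and nonnegative feature weights are learned globally by gradient ascent on the average margin with an L1-style penalty for sparsity. The function `local_margin_weights` and all of its parameters are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def local_margin_weights(X, y, n_iter=100, lr=0.05, reg=0.01):
    """Illustrative sketch (not the paper's exact method): learn
    nonnegative feature weights by maximizing a local hypothesis
    margin, with a small L1-style penalty to suppress irrelevant
    features."""
    n, d = X.shape
    w = np.ones(d)
    for _ in range(n_iter):
        grad = np.zeros(d)
        for i in range(n):
            diff = np.abs(X - X[i])          # per-feature distances to x_i
            dist = diff @ w                  # weighted L1 distance
            dist[i] = np.inf                 # exclude the point itself
            same = (y == y[i])
            hit_d = np.where(same, dist, np.inf)
            miss_d = np.where(~same, dist, np.inf)
            if not np.isfinite(hit_d).any() or not np.isfinite(miss_d).any():
                continue                     # no valid hit or miss
            hit, miss = np.argmin(hit_d), np.argmin(miss_d)
            # locally linear margin: push the nearest miss away,
            # pull the nearest hit closer, feature by feature
            grad += diff[miss] - diff[hit]
        # global update on the averaged margin, kept nonnegative
        w = np.maximum(w + lr * (grad / n - reg), 0.0)
    return w
```

On data where only one feature carries the class signal, the learned weight for that feature dominates while the weights of irrelevant features shrink toward zero, which is the behavior the abstract describes.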