For a fixed sample size, a common phenomenon is that the error of a designed classifier decreases and then increases as the number of features grows. This peaking phenomenon has been recognized for forty years and depends on the classification rule and the feature-label distribution. Historically, the peaking phenomenon has been treated by assuming a fixed ordering of the features, usually beginning with the strongest individual feature and proceeding with features of decreasing individual classification capability. This does not take into account feature selection, which is commonplace in high-dimensional, small-sample settings. This paper revisits the peaking phenomenon in the presence of feature selection. Using massive simulation in a high-performance computing environment, the paper considers various combinations of feature-label models, feature-selection algorithms, and classifier models to produce a large library of error-versus-feature-size curves. Owing to the prevalence of feature selection in genomic classification, we also consider gene-expression-based classification of breast-cancer patient prognosis. Results vary widely and are strongly dependent on the combination. The error curves tend to fall into three categories: peaking, settling into a plateau, or falling very slowly over a long range of feature set sizes. It can be concluded that one should be wary of applying peaking results found in the absence of feature selection to settings in which feature selection is employed.
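The classical peaking phenomenon described above (error falling, then rising, as features are added in a fixed order of decreasing individual strength) can be reproduced with a small simulation. The sketch below is not the paper's experimental setup; it is a minimal illustration assuming Gaussian classes whose per-feature mean separation decays as 1/k, classified by plug-in LDA with a pooled sample covariance (pseudo-inverse for numerical stability):

```python
import numpy as np

def peaking_curve(n_train=20, max_d=30, n_test=2000, n_trials=30, seed=0):
    """Average test error of plug-in LDA vs. number of features.

    Features are added in a fixed order from strongest to weakest
    (per-feature class-mean separation 2/k for feature k), as in the
    classical peaking studies. Returns an array of length max_d.
    """
    rng = np.random.default_rng(seed)
    errors = np.zeros(max_d)
    for _ in range(n_trials):
        for d in range(1, max_d + 1):
            mu = 1.0 / np.arange(1, d + 1)  # class 1 mean; class 0 mean is -mu
            X0 = rng.normal(-mu, 1.0, size=(n_train, d))
            X1 = rng.normal(mu, 1.0, size=(n_train, d))
            # Plug-in LDA: pooled sample covariance, pseudo-inverse
            m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
            Xc = np.vstack([X0 - m0, X1 - m1])
            S = Xc.T @ Xc / (2 * n_train - 2)
            w = np.linalg.pinv(S) @ (m1 - m0)
            b = -0.5 * w @ (m0 + m1)
            # Estimate the true error on a fresh test sample
            T0 = rng.normal(-mu, 1.0, size=(n_test, d))
            T1 = rng.normal(mu, 1.0, size=(n_test, d))
            err = ((T0 @ w + b > 0).mean() + (T1 @ w + b <= 0).mean()) / 2
            errors[d - 1] += err
    return errors / n_trials

curve = peaking_curve()
print("optimal feature count:", int(np.argmin(curve)) + 1)
```

With a fixed training sample of 40 points, the curve typically falls for the first handful of features and then climbs as weak features inflate the covariance-estimation burden; the minimum sits well below the maximum dimension, which is the peaking behavior at issue. Note that this fixed-ordering setup is exactly the historical treatment the paper argues does not transfer directly to pipelines that perform feature selection on the data.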