One common criterion for evaluating feature selection methods is the performance of a classifier trained on the selected features. Another important criterion that has, until recently, been neglected is the stability of the feature selection methods themselves. While previous studies have measured stability as the degree of agreement between the outputs of a technique applied to randomly drawn subsets of the same data, this study demonstrates the importance of evaluating stability in the presence of noise. Experiments are conducted with 17 filters (six standard filter-based ranking techniques and 11 threshold-based feature selection techniques) on nine real-world datasets. The paper identifies the techniques that are inherently more sensitive to class noise and shows how data characteristics such as sample size and class imbalance can affect the stability of some feature selection methods.
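The stability criterion described above, agreement between feature subsets selected from resampled versions of the same data, is commonly quantified with a consistency index such as Kuncheva's. The sketch below is illustrative and not taken from the paper; the function names and the averaging scheme are assumptions.

```python
from itertools import combinations

def kuncheva_index(a, b, n):
    # Kuncheva's consistency index between two feature subsets of
    # equal size k drawn from n total features; corrects for the
    # overlap expected by chance. Ranges from -1 to 1 (1 = identical).
    k = len(a)
    assert len(b) == k and 0 < k < n
    r = len(set(a) & set(b))  # number of shared features
    return (r * n - k * k) / (k * (n - k))

def stability(subsets, n):
    # Average pairwise consistency over the subsets selected from
    # each resampled (or noise-injected) copy of the data.
    pairs = list(combinations(subsets, 2))
    return sum(kuncheva_index(a, b, n) for a, b in pairs) / len(pairs)
```

For example, with `n = 10` features, the subsets `{0, 1, 2}`, `{0, 1, 2}`, and `{0, 1, 3}` yield pairwise indices of 1, 11/21, and 11/21, so the averaged stability is 43/63, roughly 0.68. A perfectly stable selector would score 1 across all resamplings.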