Speeding Up Feature Subset Selection Through Mutual Information Relevance Filtering

  • Authors:
  • Gert Van Dijck; Marc M. Van Hulle

  • Affiliations:
  • Katholieke Universiteit Leuven, Computational Neuroscience Research Group, bus 1021, B-3000 Leuven, Belgium (both authors)

  • Venue:
  • PKDD 2007: Proceedings of the 11th European Conference on Principles and Practice of Knowledge Discovery in Databases
  • Year:
  • 2007


Abstract

A relevance filter is proposed which removes features based on the mutual information between class labels and features. It is proven that both feature independence and class-conditional feature independence are required for the filter to be statistically optimal; this is shown by establishing a relationship with the conditional relative entropy framework for feature selection. Removing features at various significance levels as a preprocessing step to sequential forward search yields a large speed-up without a decrease in classification accuracy. These results are demonstrated in experiments on five high-dimensional, publicly available gene expression data sets.
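The abstract describes a two-stage idea: score each feature by its mutual information with the class labels, keep only features whose score is significant at some level, then run sequential forward search on the survivors. The paper itself is not reproduced here, so the following is only a minimal sketch of such a filter, assuming discrete (e.g. pre-binned) feature values and a permutation test to set the significance threshold; the function names and the `alpha`/`n_perm` parameterization are hypothetical, not taken from the paper.

```python
import numpy as np

def mutual_information(x, y):
    """Empirical mutual information (in nats) between two discrete arrays."""
    n = len(x)
    joint = {}
    for xi, yi in zip(x, y):
        joint[(xi, yi)] = joint.get((xi, yi), 0) + 1
    px, py = {}, {}
    for (xi, yi), c in joint.items():
        px[xi] = px.get(xi, 0) + c
        py[yi] = py.get(yi, 0) + c
    mi = 0.0
    for (xi, yi), c in joint.items():
        pxy = c / n
        mi += pxy * np.log(pxy / ((px[xi] / n) * (py[yi] / n)))
    return mi

def relevance_filter(X, y, alpha=0.05, n_perm=200, seed=0):
    """Keep features whose MI with the labels is significant at level alpha.

    Significance is judged with a permutation test: permuting a feature's
    values breaks any dependence on y, giving a null distribution of MI
    scores. This is an illustrative stand-in for the paper's filter, not
    its exact procedure.
    """
    rng = np.random.default_rng(seed)
    kept = []
    for j in range(X.shape[1]):
        observed = mutual_information(X[:, j], y)
        null = [mutual_information(rng.permutation(X[:, j]), y)
                for _ in range(n_perm)]
        p_value = np.mean([m >= observed for m in null])
        if p_value < alpha:
            kept.append(j)
    return kept
```

A stricter `alpha` removes more features before the (expensive) wrapper search begins, which is the source of the reported speed-up; continuous inputs such as gene expression values would first need discretization or a continuous MI estimator.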