A method is proposed to achieve feature selection for classification problems polluted by label noise. The performance of traditional feature selection algorithms often degrades sharply when some samples are wrongly labelled. A method based on a probabilistic label noise model, combined with a nearest-neighbours-based entropy estimator, is introduced to robustly estimate the mutual information, a popular relevance criterion for feature selection. A backward greedy search procedure is used with this criterion to find relevant feature subsets. Experiments establish that (i) there is a real need to take possible label noise into account when selecting features and (ii) the proposed methodology effectively reduces the negative impact of mislabelled data points on the feature selection process.
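The backward greedy search with a mutual-information criterion can be sketched as follows. This is an illustrative sketch only, not the paper's method: it uses scikit-learn's standard nearest-neighbours mutual information estimator (`mutual_info_classif`) rather than the noise-robust estimator the abstract describes, and the function name `backward_greedy_select` and all parameter values are assumptions for the example.

```python
# Sketch of backward greedy feature selection driven by a
# nearest-neighbours mutual information criterion.
# NOTE: uses sklearn's plain MI estimator, not the paper's
# label-noise-robust variant; names and settings are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

def backward_greedy_select(X, y, n_keep, random_state=0):
    """Repeatedly drop the feature whose removal best preserves
    the total mutual information between the subset and the labels."""
    remaining = list(range(X.shape[1]))
    while len(remaining) > n_keep:
        scores = []
        for f in remaining:
            subset = [g for g in remaining if g != f]
            # kNN-based MI estimate for each feature in the candidate subset
            mi = mutual_info_classif(X[:, subset], y, n_neighbors=3,
                                     random_state=random_state)
            scores.append(mi.sum())
        # remove the feature whose deletion leaves the highest total MI
        remaining.pop(int(np.argmax(scores)))
    return remaining

X, y = make_classification(n_samples=200, n_features=8, n_informative=3,
                           n_redundant=0, random_state=0)
# simulate label noise: flip 10% of the binary labels at random
rng = np.random.default_rng(0)
flip = rng.random(len(y)) < 0.1
y_noisy = np.where(flip, 1 - y, y)

selected = backward_greedy_select(X, y_noisy, n_keep=3)
print(selected)
```

With a plain MI estimator, the flipped labels bias the relevance scores; the abstract's contribution is precisely to replace this estimator with one that models the label noise probabilistically so the greedy search stays reliable.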