Feature selection with mutual information for uncertain data

Authors:
Gauthier Doquire; Michel Verleysen
Affiliations:
Université catholique de Louvain, Machine Learning Group - ICTEAM, Louvain-la-Neuve, Belgium; Université catholique de Louvain, Machine Learning Group - ICTEAM, Louvain-la-Neuve, Belgium
Venue:
DaWaK'11 Proceedings of the 13th international conference on Data warehousing and knowledge discovery
Year:
2011

Abstract

In many real-world situations, data cannot be assumed to be precise. Uncertain data are often encountered, for example because of the imprecision of measurement devices or because objects move continuously, making their exact position impossible to obtain. One way to model this uncertainty is to represent each data value as a probability distribution function; recent work shows that properly taking the uncertainty into account generally leads to improved classification performance. Working with such a representation, this paper proposes a feature selection method based on mutual information. Experiments on 8 UCI data sets show that the proposed approach is effective at selecting relevant features.
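
The abstract does not say how mutual information is estimated from values given as distributions. The following minimal sketch illustrates the general idea under assumptions that are not taken from the paper: each uncertain value is modelled as a Gaussian with a known mean and standard deviation, the target is a class label, realisations are drawn by Monte Carlo sampling, and mutual information is estimated with a simple histogram plug-in estimator. The function names (`histogram_mi`, `uncertain_mi`, `rank_features`) and all parameters are hypothetical; the authors' estimator may differ.

```python
import numpy as np


def histogram_mi(x, y, bins=8):
    """Plug-in estimate (in nats) of I(X; Y) for a continuous x and discrete labels y."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y).ravel()
    # Equal-width bins over the observed range of x.
    edges = np.histogram_bin_edges(x, bins=bins)
    # Using only the interior edges yields bin indices in 0 .. bins-1.
    x_bin = np.digitize(x, edges[1:-1])
    classes, y_idx = np.unique(y, return_inverse=True)
    # Joint histogram of (bin, class), normalised to a probability table.
    joint = np.zeros((bins, classes.size))
    np.add.at(joint, (x_bin, y_idx), 1.0)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / (px @ py)[mask])))


def uncertain_mi(means, stds, y, n_draws=50, bins=8, seed=0):
    """MI between one uncertain feature and the labels.

    Assumption (not from the paper): value i follows N(means[i], stds[i]**2).
    """
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    # Draw n_draws realisations of every object; result has shape (n_draws, n).
    samples = rng.normal(means, stds, size=(n_draws, means.size))
    # Repeat the labels so that they line up with the flattened samples.
    labels = np.tile(np.asarray(y), n_draws)
    return histogram_mi(samples.ravel(), labels, bins=bins)


def rank_features(means, stds, y, **kwargs):
    """Score every column and return (indices sorted by decreasing MI, scores)."""
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    scores = np.array([
        uncertain_mi(means[:, j], stds[:, j], y, **kwargs)
        for j in range(means.shape[1])
    ])
    return np.argsort(scores)[::-1], scores
```

Calling `rank_features(means, stds, y)` with two `(n_objects, n_features)` arrays and a label vector returns the features ordered by estimated relevance. This is a univariate ranking only; it ignores redundancy between features, and the histogram estimator is biased for small samples.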