Distribution of mutual information for robust feature selection

Authors:
Marcus Hutter;Marco Zaffalon
Affiliations:
-;-
Venue:
Distribution of mutual information for robust feature selection
Year:
2002

Citing 0
Cited 2

Robust feature selection by mutual information distributions
UAI'02 Proceedings of the Eighteenth conference on Uncertainty in artificial intelligence

Statistical measurement of information leakage
TACAS'10 Proceedings of the 16th international conference on Tools and Algorithms for the Construction and Analysis of Systems

Abstract

Mutual information is widely used in artificial intelligence, in a descriptive way, to measure the stochastic dependence of discrete random variables. To address questions such as the reliability of the empirical value, one must consider sample-to-population inferential approaches. This paper deals with the distribution of mutual information, as obtained in a Bayesian framework by using second-order Dirichlet prior distributions. We derive reliable and quickly computable analytical approximations for the distribution of mutual information, concentrating on the mean, variance, skewness, and kurtosis. For the mean we also provide an exact expression. The results are applied to the problem of selecting features for incremental learning and classification with the naive Bayes classifier. A fast, newly defined filter is shown to outperform the traditional approach based on empirical mutual information on a number of real data sets. A theoretical development allows the above methods to be extended to incomplete samples in an easy and effective way. Further experiments on incomplete data sets support the extension of the proposed filter to the case of missing data.
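
The abstract mentions an exact expression for the posterior mean of mutual information but does not state it. As a rough illustration only, the Python sketch below contrasts the empirical (plug-in) value with the posterior mean under a Dirichlet prior, using the closed form commonly given for this setting: E[I] = (1/n) Σ_ij n_ij [ψ(n_ij + 1) − ψ(n_i+ + 1) − ψ(n_+j + 1) + ψ(n + 1)], where the n_ij are the observed counts plus the prior counts and ψ is the digamma function. The function names, the uniform prior count of 1 per cell, and the hand-rolled digamma approximation are assumptions made for this example, not details taken from the paper.

```python
import math


def digamma(x):
    """Approximate the digamma function for x > 0.

    Shifts x upward with psi(x) = psi(x + 1) - 1/x, then applies the
    standard asymptotic series. Illustrative helper, not from the paper.
    """
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    result += math.log(x) - 0.5 / x
    result -= inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result


def _margins(table):
    """Return (row totals, column totals, grand total) of a 2-D table."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    return rows, cols, sum(rows)


def empirical_mi(counts):
    """Plug-in mutual information (in nats) from a table of counts."""
    rows, cols, n = _margins(counts)
    mi = 0.0
    for i, row in enumerate(counts):
        for j, c in enumerate(row):
            if c > 0:
                mi += (c / n) * math.log(c * n / (rows[i] * cols[j]))
    return mi


def posterior_mean_mi(counts, prior=1.0):
    """Posterior mean of mutual information under a Dirichlet prior.

    `prior` is the pseudo-count added to every cell (an assumption here;
    other choices of prior count are possible).
    """
    post = [[c + prior for c in row] for row in counts]
    rows, cols, n = _margins(post)
    total = 0.0
    for i, row in enumerate(post):
        for j, nij in enumerate(row):
            total += nij * (digamma(nij + 1) - digamma(rows[i] + 1)
                            - digamma(cols[j] + 1) + digamma(n + 1))
    return total / n


if __name__ == "__main__":
    independent = [[25, 25], [25, 25]]   # feature and class look unrelated
    dependent = [[50, 0], [0, 50]]       # feature determines the class
    for name, table in [("independent", independent), ("dependent", dependent)]:
        print(name, empirical_mi(table), posterior_mean_mi(table))
```

On the independent table the plug-in estimate is exactly zero, while the posterior mean is small but positive, reflecting the uncertainty that remains after a finite sample. A filter built on the posterior distribution rather than on the point estimate can take that uncertainty into account when ranking features.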