Distribution of mutual information for robust feature selection
Mutual information is widely used in artificial intelligence, in a descriptive way, to measure the stochastic dependence of discrete random variables. To address questions such as the reliability of the empirical value, one must adopt sample-to-population inferential approaches. This paper deals with the distribution of mutual information, as obtained in a Bayesian framework by a second-order Dirichlet prior distribution. The exact analytical expression for the mean and an analytical approximation of the variance are reported. Asymptotic approximations of the distribution are proposed. The results are applied to feature selection for incremental learning and classification with the naive Bayes classifier. A fast, newly defined method is shown to outperform the traditional approach based on empirical mutual information on a number of real data sets. Finally, a theoretical development is reported that allows the above methods to be extended to incomplete samples in an efficient and effective way.
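To make the contrast in the abstract concrete, the following is a minimal Python sketch of the two quantities involved: the empirical (plug-in) mutual information of a contingency table, and the posterior mean of mutual information under a Dirichlet prior. The mean formula used here is the digamma-based expression known from the Bayesian mutual-information literature; the uniform per-cell pseudocount `alpha` and the function names are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.special import digamma

def mutual_information(counts):
    """Empirical (plug-in) mutual information of a 2-D contingency table, in nats."""
    n = counts.sum()
    p = counts / n
    pi = p.sum(axis=1, keepdims=True)   # row marginals
    pj = p.sum(axis=0, keepdims=True)   # column marginals
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(p > 0, p * np.log(p / (pi * pj)), 0.0)
    return terms.sum()

def bayesian_mi_mean(counts, alpha=1.0):
    """Posterior mean of mutual information under a Dirichlet prior with
    pseudocount `alpha` per cell (digamma-based formula; `alpha` is an
    illustrative choice)."""
    m = counts + alpha                  # posterior Dirichlet parameters
    N = m.sum()
    mi = m.sum(axis=1, keepdims=True)   # row sums of posterior counts
    mj = m.sum(axis=0, keepdims=True)   # column sums of posterior counts
    terms = (m / N) * (digamma(m + 1) - digamma(mi + 1)
                       - digamma(mj + 1) + digamma(N + 1))
    return terms.sum()

# A perfectly balanced table has empirical MI exactly zero, while a
# diagonal table attains ln 2; the Bayesian mean shrinks both toward
# moderate values, which is what makes it usable for robust ranking.
independent = np.array([[10.0, 10.0], [10.0, 10.0]])
dependent = np.array([[20.0, 0.0], [0.0, 20.0]])
print(mutual_information(independent), bayesian_mi_mean(independent))
print(mutual_information(dependent), bayesian_mi_mean(dependent))
```

For feature selection, one would rank each feature by `bayesian_mi_mean` of its table against the class variable instead of by the raw empirical value; the paper's method additionally exploits the variance and distribution of mutual information, which this sketch omits.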