On Concentration of Discrete Distributions with Applications to Supervised Learning of Classifiers

Authors:
Magnus Ekdahl;Timo Koski
Affiliations:
Department of Mathematics, Linköpings University, SE-581 83 Linköping, Sweden;Department of Mathematics, Linköpings University, SE-581 83 Linköping, Sweden
Venue:
MLDM '07 Proceedings of the 5th international conference on Machine Learning and Data Mining in Pattern Recognition
Year:
2007

Abstract

Computational procedures using independence assumptions in various forms are popular in machine learning, although checks on empirical data have given inconclusive results about their impact. Some theoretical understanding of when they work is available, but a definite answer seems to be lacking. This paper derives distributions that maximize the statewise difference to the respective product of marginals. These distributions are, in a sense, the worst distributions for predicting an outcome of the data-generating mechanism under an independence assumption. We also restrict the scope of new theoretical results by showing explicitly that, depending on context, independent ('Naïve') classifiers can be as bad as tossing coins. Regardless of this, independence may beat the generating model in learning supervised classification, and we explicitly provide one such scenario.
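
To make the notion of a "statewise difference to the product of marginals" concrete, the following is a minimal Python sketch. It is not taken from the paper. The function names (`marginals`, `max_statewise_difference`) and the two toy distributions are assumptions chosen for illustration. The sketch computes, for a discrete joint distribution given as a dictionary from states to probabilities, the largest absolute gap between the joint probability of a state and the product of its marginal probabilities.

```python
from itertools import product


def marginals(joint, d):
    """Return d dicts, one per coordinate, holding that coordinate's marginal."""
    margs = [dict() for _ in range(d)]
    for state, p in joint.items():
        for i, value in enumerate(state):
            margs[i][value] = margs[i].get(value, 0.0) + p
    return margs


def max_statewise_difference(joint, d):
    """Largest |P(x) - prod_i P_i(x_i)| over all states x (illustrative helper)."""
    margs = marginals(joint, d)
    worst = 0.0
    # Enumerate every combination of observed coordinate values,
    # including states to which the joint assigns probability zero.
    for state in product(*(sorted(m) for m in margs)):
        indep = 1.0
        for i, value in enumerate(state):
            indep *= margs[i][value]
        worst = max(worst, abs(joint.get(state, 0.0) - indep))
    return worst


# Hypothetical toy distributions over two binary variables.
perfectly_dependent = {(0, 0): 0.5, (1, 1): 0.5}
independent = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}

print(max_statewise_difference(perfectly_dependent, 2))  # 0.25
print(max_statewise_difference(independent, 2))          # 0.0
```

In the dependent toy example both marginals are uniform, so the product of marginals assigns 0.25 to every state while the joint assigns 0.5 or 0, giving a gap of 0.25 at each state. For the independent example the gap is zero, as expected. The sketch only evaluates the gap for a given distribution. It does not search for the maximizing distributions that the paper derives.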