On the Optimality of the Simple Bayesian Classifier under Zero-One Loss

  • Authors:
  • Pedro Domingos, Michael Pazzani

  • Affiliations:
  • Department of Information and Computer Science, University of California, Irvine, CA 92697. E-mail: pedrod@ics.uci.edu, pazzani@ics.uci.edu

  • Venue:
  • Machine Learning - Special issue on learning with probabilistic representations
  • Year:
  • 1997

Abstract

The simple Bayesian classifier is known to be optimal when attributes are independent given the class, but the question of whether other sufficient conditions for its optimality exist has so far not been explored. Empirical results showing that it performs surprisingly well in many domains containing clear attribute dependences suggest that the answer to this question may be positive. This article shows that, although the Bayesian classifier's probability estimates are only optimal under quadratic loss if the independence assumption holds, the classifier itself can be optimal under zero-one loss (misclassification rate) even when this assumption is violated by a wide margin. The region of quadratic-loss optimality of the Bayesian classifier is in fact a second-order infinitesimal fraction of the region of zero-one optimality. This implies that the Bayesian classifier has a much greater range of applicability than previously thought. For example, in this article it is shown to be optimal for learning conjunctions and disjunctions, even though they violate the independence assumption. Further, studies in artificial domains show that it will often outperform more powerful classifiers for common training set sizes and numbers of attributes, even if its bias is a priori much less appropriate to the domain. This article's results also imply that detecting attribute dependence is not necessarily the best way to extend the Bayesian classifier, and this is also verified empirically.
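The conjunction claim in the abstract can be checked with a minimal sketch (this code is not from the article; the `fit`/`predict` helpers are hypothetical illustrations). A naive Bayes classifier with maximum-likelihood estimates, trained on all examples of the concept x1 ∧ x2 ∧ x3, classifies every example correctly under zero-one loss, even though the attributes are clearly dependent given the negative class:

```python
from itertools import product

# All 8 boolean examples; the class is the conjunction x1 AND x2 AND x3.
data = [(x, int(all(x))) for x in product([0, 1], repeat=3)]

def fit(data):
    """Maximum-likelihood estimates (no smoothing) for a naive Bayes model."""
    by_class = {0: [], 1: []}
    for x, c in data:
        by_class[c].append(x)
    prior = {c: len(xs) / len(data) for c, xs in by_class.items()}
    # cond[c][i] = estimated P(x_i = 1 | class = c)
    cond = {c: [sum(x[i] for x in xs) / len(xs) for i in range(3)]
            for c, xs in by_class.items()}
    return prior, cond

def predict(x, prior, cond):
    """Pick the class maximizing P(c) * prod_i P(x_i | c)."""
    def score(c):
        s = prior[c]
        for i, xi in enumerate(x):
            p = cond[c][i]
            s *= p if xi else 1 - p
        return s
    return max((0, 1), key=score)

prior, cond = fit(data)
# Zero-one optimal on the conjunction despite violated independence:
assert all(predict(x, prior, cond) == c for x, c in data)
```

The probability estimates themselves are far from the true class posteriors here (for instance, the score for the all-ones example is only 1/8 for the positive class), which is the abstract's point: zero-one-loss optimality holds well outside the region where the estimates are accurate under quadratic loss.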