Improving Tree augmented Naive Bayes for class probability estimation

  • Authors:
  • Liangxiao Jiang;Zhihua Cai;Dianhong Wang;Harry Zhang

  • Affiliations:
  • Department of Computer Science, China University of Geosciences, Wuhan, Hubei 430074, China;Department of Computer Science, China University of Geosciences, Wuhan, Hubei 430074, China;Department of Electronic Engineering, China University of Geosciences, Wuhan, Hubei 430074, China;Faculty of Computer Science, University of New Brunswick, Fredericton, New Brunswick, Canada E3B 5A3

  • Venue:
  • Knowledge-Based Systems
  • Year:
  • 2012

Abstract

Numerous algorithms have been proposed to improve Naive Bayes (NB) by weakening its conditional attribute independence assumption. Among them, Tree Augmented Naive Bayes (TAN) has demonstrated remarkable classification performance, measured by classification accuracy or error rate, while maintaining efficiency and simplicity. In many real-world applications, however, classification accuracy or error rate is not enough. For example, in direct marketing, we often need to deploy different promotion strategies to customers with different likelihoods (class probabilities) of buying some products. Thus, accurate class probability estimation is often required to make optimal decisions. In this paper, we investigate the class probability estimation performance of TAN in terms of conditional log likelihood (CLL) and present a new algorithm that improves it by averaging spanning TAN classifiers. We call our improved algorithm Averaged Tree Augmented Naive Bayes (ATAN). Experimental results on a large number of UCI datasets published on the main website of the Weka platform show that ATAN significantly outperforms TAN and all the other compared algorithms in terms of CLL.
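
As a rough illustration of the evaluation measure named in the abstract, the sketch below computes CLL as the sum of log P(true class | instance) over a test set and averages the class probability estimates of several classifiers. The abstract does not say how ATAN builds or combines its TAN members, so the plain arithmetic mean, the function names, and the toy probability tables are assumptions made for illustration only, not the authors' implementation.

```python
import math


def conditional_log_likelihood(prob_rows, true_labels):
    """Sum of log P(true class | instance) over all instances."""
    return sum(math.log(row[label]) for row, label in zip(prob_rows, true_labels))


def average_estimates(per_model_rows):
    """Arithmetic mean of class probability estimates across models.

    per_model_rows[m][i][c] is model m's estimate of P(class c | instance i).
    Assumption: every model is weighted equally.
    """
    n_models = len(per_model_rows)
    n_instances = len(per_model_rows[0])
    n_classes = len(per_model_rows[0][0])
    return [
        [
            sum(model[i][c] for model in per_model_rows) / n_models
            for c in range(n_classes)
        ]
        for i in range(n_instances)
    ]


# Hypothetical estimates from two classifiers on three instances, two classes.
model_a = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
model_b = [[0.7, 0.3], [0.4, 0.6], [0.2, 0.8]]
labels = [0, 1, 0]

averaged = average_estimates([model_a, model_b])
cll_a = conditional_log_likelihood(model_a, labels)
cll_b = conditional_log_likelihood(model_b, labels)
cll_avg = conditional_log_likelihood(averaged, labels)

print(f"CLL model A:  {cll_a:.4f}")
print(f"CLL model B:  {cll_b:.4f}")
print(f"CLL averaged: {cll_avg:.4f}")
```

A CLL closer to zero indicates better class probability estimates. Because the logarithm is concave, the CLL of the averaged estimates is never lower than the mean of the individual models' CLLs, although it can still fall below that of the best single model.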