Improving Tree augmented Naive Bayes for class probability estimation

Authors:
Liangxiao Jiang; Zhihua Cai; Dianhong Wang; Harry Zhang
Affiliations:
Department of Computer Science, China University of Geosciences, Wuhan, Hubei 430074, China; Department of Computer Science, China University of Geosciences, Wuhan, Hubei 430074, China; Department of Electronic Engineering, China University of Geosciences, Wuhan, Hubei 430074, China; Faculty of Computer Science, University of New Brunswick, Fredericton, New Brunswick, Canada E3B 5A3
Venue:
Knowledge-Based Systems
Year:
2012

Abstract

Numerous algorithms have been proposed to improve Naive Bayes (NB) by weakening its conditional attribute independence assumption. Among them, Tree Augmented Naive Bayes (TAN) has demonstrated remarkable classification performance, measured by classification accuracy or error rate, while remaining efficient and simple. In many real-world applications, however, classification accuracy or error rate is not enough. For example, in direct marketing, we often need to deploy different promotion strategies for customers with different likelihoods (class probabilities) of buying a product. Accurate class probability estimation is therefore often required to make optimal decisions. In this paper, we investigate the class probability estimation performance of TAN in terms of conditional log likelihood (CLL) and present a new algorithm that improves it by using spanning TAN classifiers. We call the improved algorithm Averaged Tree Augmented Naive Bayes (ATAN). Experimental results on a large number of UCI datasets published on the main website of the Weka platform show that ATAN significantly outperforms TAN and all the other algorithms compared in terms of CLL.
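To make the evaluation measure concrete, the following is a minimal sketch of how CLL can be computed from class probability estimates, and of how estimates from several classifiers might be averaged. It is not the authors' implementation. It assumes, as the name ATAN suggests, that member estimates are combined by a plain unweighted average; the function names and the toy probabilities are hypothetical and chosen only for illustration.

```python
import math


def conditional_log_likelihood(probs, labels):
    """Sum of log P(true class | instance) over a labelled test set.

    probs[i][c] is the estimated probability of class c for instance i;
    labels[i] is the index of the true class of instance i.
    """
    return sum(math.log(p[y]) for p, y in zip(probs, labels))


def average_estimates(member_probs):
    """Unweighted average of class probability vectors (an assumption).

    member_probs[m][i][c] is member m's estimate for instance i, class c.
    Returns one averaged probability vector per instance.
    """
    n_members = len(member_probs)
    return [
        [sum(class_column) / n_members for class_column in zip(*per_instance)]
        for per_instance in zip(*member_probs)
    ]


# Hypothetical estimates from two member classifiers on three instances,
# two classes. These numbers are made up for illustration only.
member_a = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
member_b = [[0.7, 0.3], [0.4, 0.6], [0.2, 0.8]]
labels = [0, 1, 1]

averaged = average_estimates([member_a, member_b])

cll_a = conditional_log_likelihood(member_a, labels)
cll_b = conditional_log_likelihood(member_b, labels)
cll_avg = conditional_log_likelihood(averaged, labels)

print(f"CLL member a: {cll_a:.4f}")
print(f"CLL member b: {cll_b:.4f}")
print(f"CLL averaged: {cll_avg:.4f}")
```

CLL is at most zero, and values closer to zero indicate better probability estimates. Because the logarithm is concave, the CLL of the averaged estimates is never lower than the mean of the members' CLLs, which is one general reason averaging can help probability estimation; this says nothing about the specific gains reported for ATAN.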