Learning TAN from incomplete data

Authors:
Fengzhan Tian;Zhihai Wang;Jian Yu;Houkuan Huang
Affiliations:
School of Computer & Information Technology, Beijing Jiaotong University, Beijing, P. R. China (all authors)
Venue:
ICIC'05 Proceedings of the 2005 international conference on Advances in Intelligent Computing - Volume Part I
Year:
2005

Abstract

The tree-augmented naive Bayes (TAN) classifier offers a good trade-off between model complexity and learnability in practice. Because few real-world datasets are complete, this paper studies how to learn TAN efficiently from incomplete data. We first present an efficient method that estimates conditional mutual information directly from incomplete data. We then extend the basic TAN learning algorithm to incomplete data using this estimation method. Finally, we carry out experiments to evaluate the extended TAN and compare it with basic TAN. The experimental results show that the extended TAN is much more accurate than basic TAN on most of the incomplete datasets. Although the extended TAN takes more time than basic TAN, its running time remains acceptable. Our conditional mutual information estimation method can easily be combined with other techniques to improve TAN further.
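
To make the two steps in the abstract concrete, the following is a minimal sketch of how conditional mutual information might be estimated when some values are missing, and how the estimates feed the usual TAN tree construction. It is not the authors' estimator, which the abstract does not describe. It assumes a simple available-case rule: for each attribute pair, only the records in which both attributes and the class are observed are counted. The names MISSING, conditional_mutual_information and tan_tree, and the representation of records as tuples, are hypothetical choices made for illustration.

```python
import math
from collections import Counter

MISSING = None  # hypothetical marker for a missing value


def conditional_mutual_information(rows, i, j, c):
    """Estimate I(X_i; X_j | C) from rows where columns i, j and c are all observed.

    Assumption: available-case counting, a stand-in for the paper's estimator.
    """
    usable = [(r[i], r[j], r[c]) for r in rows
              if r[i] is not MISSING and r[j] is not MISSING and r[c] is not MISSING]
    n = len(usable)
    if n == 0:
        return 0.0
    n_xyc = Counter(usable)
    n_xc = Counter((x, cls) for x, _, cls in usable)
    n_yc = Counter((y, cls) for _, y, cls in usable)
    n_c = Counter(cls for _, _, cls in usable)
    cmi = 0.0
    for (x, y, cls), count in n_xyc.items():
        # P(x,y|c) / (P(x|c) * P(y|c)) expressed with raw counts
        ratio = count * n_c[cls] / (n_xc[(x, cls)] * n_yc[(y, cls)])
        cmi += (count / n) * math.log(ratio)
    return cmi


def tan_tree(rows, attributes, c):
    """Return {attribute: parent or None} from a maximum weighted spanning tree.

    Edge weights are the pairwise estimates above; the tree is grown with
    Prim's algorithm from the first attribute, which becomes the root.
    """
    weights = {}
    for a, u in enumerate(attributes):
        for v in attributes[a + 1:]:
            w = conditional_mutual_information(rows, u, v, c)
            weights[(u, v)] = weights[(v, u)] = w
    parent = {attributes[0]: None}
    while len(parent) < len(attributes):
        # Pick the heaviest edge leaving the current tree
        _, u, v = max(
            ((weights[(u, v)], u, v)
             for u in parent for v in attributes if v not in parent),
            key=lambda edge: edge[0],
        )
        parent[v] = u
    return parent
```

For example, with records stored as (x0, x1, x2, class) tuples, tan_tree(rows, [0, 1, 2], 3) returns the parent of each attribute in the augmenting tree; in a TAN classifier every attribute additionally has the class as a parent. Because each pair draws on its own subset of records, the estimates come from different sample sizes, which is one reason a dedicated estimator such as the one the paper proposes may be preferable to this simple rule.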