Handling numeric attributes when comparing Bayesian network classifiers: does the discretization method matter?

  • Authors:
  • M. Julia Flores, José A. Gámez, Ana M. Martínez, José M. Puerta

  • Affiliations:
  • Computer Systems Department, Intelligent Systems & Data Mining (SIMD), I3A, University of Castilla-La Mancha, Albacete, Spain (all four authors)

  • Venue:
  • Applied Intelligence
  • Year:
  • 2011

Abstract

Within the framework of Bayesian networks (BNs), most classifiers assume that the variables involved are discrete, but this assumption rarely holds in real problems. Despite the loss of information that discretization entails, it is a direct, easy-to-use mechanism that offers some benefits: it can improve the run time of certain algorithms; it reduces the value set, and with it the noise that may be present in the data; and some Bayesian methods can only deal with discrete variables. Hence, even though there are many ways to handle continuous variables other than discretization, it is still commonly used. This paper presents a study of the impact of different discretization strategies on a set of representative BN classifiers, over a significant sample of 26 datasets. For this comparison we have chosen Naive Bayes (NB) together with several semi-Naive Bayes classifiers: Tree-Augmented Naive Bayes (TAN), k-Dependence Bayesian classifier (KDB), Aggregating One-Dependence Estimators (AODE) and Hybrid AODE (HAODE). We have also included an augmented Bayesian network built with a hill-climbing algorithm (BNHC). With this comparison we analyse to what extent the discretization method affects classifier performance in terms of accuracy and bias-variance decomposition. Our main conclusion is that even if a discretization method produces different results for a particular dataset, it does not really have an effect when classifiers are being compared: given a set of datasets, accuracy values may vary, but the classifier ranking is generally maintained. This is a very useful outcome; assuming that the type of discretization applied is not decisive, future experiments can be run d times faster, d being the number of discretization methods considered.
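
To make the experimental protocol concrete, below is a minimal sketch (not the authors' code) of the kind of comparison the paper performs: discretize the numeric attributes with several strategies, train a discrete Bayesian classifier on each version, and compare cross-validated accuracy. The paper's classifiers (TAN, KDB, AODE, HAODE, BNHC) and supervised discretizers such as Fayyad-Irani MDL are not available in scikit-learn, so this sketch substitutes plain Naive Bayes (CategoricalNB) and three unsupervised KBinsDiscretizer strategies; the dataset, n_bins=5, and 5-fold cross-validation are likewise illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import CategoricalNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import KBinsDiscretizer

X, y = load_iris(return_X_y=True)  # stand-in for one of the 26 datasets

# Three unsupervised strategies: equal-width, equal-frequency, k-means.
for strategy in ("uniform", "quantile", "kmeans"):
    pipe = make_pipeline(
        KBinsDiscretizer(n_bins=5, encode="ordinal", strategy=strategy),
        # min_categories guards against bins unseen in a training fold.
        CategoricalNB(min_categories=5),
    )
    scores = cross_val_score(pipe, X, y, cv=5)
    print(f"{strategy:>9}: mean 5-fold accuracy = {scores.mean():.3f}")
```

Extending the loop over several classifiers and datasets, and then comparing the per-discretization classifier rankings rather than the raw accuracies, mirrors the analysis that leads to the paper's main conclusion.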