Incremental Flexible Frequency Discretization (IFFD) is a recently proposed discretization approach for Naïve Bayes (NB). IFFD performs satisfactorily by fixing the minimal interval frequency of the discretized intervals at a constant value. In this paper, we first argue that this setting cannot guarantee optimal classification performance in terms of classification error. We observe empirically that an optimal minimal interval frequency exists for each dataset. We therefore propose a sequential-search, wrapper-based incremental discretization method for NB, named Optimal Flexible Frequency Discretization (OFFD). Experiments were conducted on 17 datasets from the UCI machine learning repository, comparing the performance of NB trained on data discretized by OFFD, IFFD, PKID, and FFD. Results show that OFFD works better than these alternatives for NB. Further experiments comparing NB with OFFD discretization against C4.5 show that our new method outperforms C4.5 on most of the datasets tested.