Investigating the Performance of Naive Bayes Classifiers and K-Nearest Neighbor Classifiers

  • Authors:
  • Mohammed J. Islam;Q. M. Jonathan Wu;Majid Ahmadi;Maher A. Sid-Ahmed

  • Venue:
  • ICCIT '07 Proceedings of the 2007 International Conference on Convergence Information Technology
  • Year:
  • 2007

Abstract

Probability theory is the framework for making decisions under uncertainty. In classification, Bayes' rule is used to calculate the probabilities of the classes, and a central problem is how to classify raw data so as to minimize the expected risk. Bayesian theory can roughly be boiled down to one principle: to see the future, one must look at the past. The naive Bayes classifier is one of the most widely used practical Bayesian learning methods. K-nearest neighbor is a supervised learning algorithm in which a new query instance is classified by a majority vote among its K nearest neighbors; it does not fit an explicit model and relies only on the stored training data. In this paper, after reviewing Bayesian theory, the naive Bayes classifier and the K-nearest neighbor classifier are implemented and applied to a "credit card approval" dataset. The performance of the two classifiers on this application is then compared in terms of correct classification and misclassification, and it is shown how the performance of the K-nearest neighbor classifier can be improved by varying the value of k.
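The two classifiers described above can be sketched in a few lines. The following is a minimal illustration only, not the paper's implementation: a Gaussian naive Bayes (log prior plus summed log likelihoods from Bayes' rule) and a K-NN majority vote, run on a tiny synthetic 2-D dataset rather than the credit card approval data used in the paper. All function and variable names here are hypothetical.

```python
from collections import Counter, defaultdict
import math

def nb_fit(train):
    """Fit a Gaussian naive Bayes model: a class prior plus a
    per-class, per-feature mean and variance (features assumed
    conditionally independent given the class)."""
    by_class = defaultdict(list)
    for x, y in train:
        by_class[y].append(x)
    model, n = {}, len(train)
    for y, xs in by_class.items():
        cols = list(zip(*xs))
        means = [sum(c) / len(c) for c in cols]
        # small epsilon keeps the variance strictly positive
        vars_ = [sum((v - m) ** 2 for v in c) / len(c) + 1e-9
                 for c, m in zip(cols, means)]
        model[y] = (len(xs) / n, means, vars_)
    return model

def nb_classify(model, x):
    """Pick the class maximizing log prior + sum of log Gaussian
    likelihoods -- Bayes' rule with the normalizing constant dropped."""
    def score(y):
        prior, means, vars_ = model[y]
        s = math.log(prior)
        for v, m, var in zip(x, means, vars_):
            s += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return s
    return max(model, key=score)

def knn_classify(train, query, k):
    """Classify `query` by majority vote among its k nearest training
    points under Euclidean distance."""
    nearest = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Tiny synthetic dataset (illustrative only): class 0 clusters near
# the origin, class 1 near (5, 5).
train = [((0, 0), 0), ((1, 0), 0), ((0, 1), 0),
         ((5, 5), 1), ((4, 5), 1), ((5, 4), 1)]

model = nb_fit(train)
print(nb_classify(model, (4.5, 4.5)))          # → 1
for k in (1, 3, 5):                            # varying k, as in the paper
    print(k, knn_classify(train, (4.5, 4.5), k))
```

On a real task such as credit card approval, the loop over k mirrors the paper's experiment: classification accuracy is measured for several values of k and the best-performing one is kept.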