Evaluation of k-Nearest Neighbor classifier performance for direct marketing

  • Authors:
  • M. Govindarajan;RM. Chandrasekaran

  • Affiliations:
  • Department of Computer Science and Engineering, Annamalai University, Annamalai Nagar, 608 002 Tamil Nadu, India;Department of Computer Science and Engineering, Annamalai University, Annamalai Nagar, 608 002 Tamil Nadu, India

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2010

Quantified Score

Hi-index 12.05

Visualization

Abstract

Text data mining is a process of exploratory data analysis. Classification maps data into predefined groups or classes. It is often referred to as supervised learning because the classes are determined before examining the data. This paper describes the proposed k-Nearest Neighbor classifier that performs comparative cross-validation for the existing k-Nearest Neighbor classifier. The feasibility and the benefits of the proposed approach are demonstrated by means of data mining problem: direct marketing. Direct marketing has become an important application field of data mining. Comparative cross-validation involves estimation of accuracy by either stratified k-fold cross-validation or equivalent repeated random subsampling. While the proposed method may have a high bias; its performance (accuracy estimation in our case) may be poor due to a high variance. Thus the accuracy with the proposed k-Nearest Neighbor classifier was less than that with the existing k-Nearest Neighbor classifier, and the smaller the improvement in runtime the larger the improvement in precision and recall. In our proposed method we have determined the classification accuracy and prediction accuracy where the prediction accuracy is comparatively high.