A novel distance-based classifier built on pattern ranking

  • Authors:
  • Dipankar Bachar;Rosa Meo

  • Affiliations:
  • Università degli Studi di Torino, Italy;Università degli Studi di Torino, Italy

  • Venue:
  • Proceedings of the 2009 ACM symposium on Applied Computing
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

Instance-based classifiers that compute similarity between instances suffer from the presence of noise in the training set and from over-fitting. In this paper we propose a new type of distance-based classifier that instead of computing distances between instances computes the distance between each test instance and the classes. Both are represented by patterns in the space of the frequent itemsets. We ranked the itemsets by metrics of itemset significance. Then we considered only the top portion of the ranking that leads the classifier to reach the maximum accuracy. We have experimented on a large collection of datasets from UCI archive with different proximity measures and different metrics of itemsets ranking. We show that our method has many benefits: it reduces the number of distance computations, improves the classification accuracy of state-of-the art classifiers, like decision trees, SVM, k-nn, Naive Bayes, rule-based classifiers and association rule-based ones and outperforms the competitors especially on noise data.