Instance-based classifiers that compute similarity between instances suffer from noise in the training set and from over-fitting. In this paper we propose a new type of distance-based classifier that, instead of computing distances between instances, computes the distance between each test instance and each class. Both test instances and classes are represented as patterns in the space of frequent itemsets. We rank the itemsets by metrics of itemset significance and keep only the top portion of the ranking at which the classifier reaches its maximum accuracy. We experimented on a large collection of datasets from the UCI archive, using different proximity measures and different itemset-ranking metrics. We show that our method has several benefits: it reduces the number of distance computations; it improves on the classification accuracy of state-of-the-art classifiers such as decision trees, SVM, k-NN, Naive Bayes, rule-based classifiers, and association rule-based classifiers; and it outperforms these competitors especially on noisy data.
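The idea of measuring the distance from a test instance to each class, rather than to each training instance, can be illustrated with a minimal sketch. This is not the authors' implementation; everything in it is an assumption made for illustration. Itemsets are mined by brute force up to size 2, each class is represented by the within-class support of every itemset, a test instance by a 0/1 containment vector, the ranking metric is the spread of support across classes, and the proximity measure is cosine distance. The function names (`frequent_itemsets`, `fit`, `predict`) and the toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def frequent_itemsets(rows, min_support=0.3, max_len=2):
    """Brute-force frequent itemset mining (toy data only)."""
    n = len(rows)
    items = sorted(set().union(*rows))
    found = []
    for size in range(1, max_len + 1):
        for cand in combinations(items, size):
            c = frozenset(cand)
            if sum(c <= r for r in rows) / n >= min_support:
                found.append(c)
    return found


def cosine_distance(u, v):
    """Assumed proximity measure: 1 - cosine similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    if nu == 0 or nv == 0:
        return 1.0
    return 1.0 - dot / (nu * nv)


def fit(train, labels, top_k, min_support=0.3):
    """Build one profile per class over the top_k ranked itemsets."""
    rows = [frozenset(r) for r in train]
    itemsets = frequent_itemsets(rows, min_support)
    by_class = {}
    for r, y in zip(rows, labels):
        by_class.setdefault(y, []).append(r)
    # Class profile: support of each itemset within that class.
    full = {
        y: [sum(s <= r for r in rs) / len(rs) for s in itemsets]
        for y, rs in by_class.items()
    }

    # Assumed significance metric: spread of support across classes.
    def spread(i):
        vals = [p[i] for p in full.values()]
        return max(vals) - min(vals)

    keep = sorted(range(len(itemsets)), key=spread, reverse=True)[:top_k]
    kept = [itemsets[i] for i in keep]
    profiles = {y: [p[i] for i in keep] for y, p in full.items()}
    return kept, profiles


def predict(instance, kept, profiles):
    """Assign the class whose profile is closest to the instance."""
    x = frozenset(instance)
    vec = [1.0 if s <= x else 0.0 for s in kept]
    return min(profiles, key=lambda y: cosine_distance(vec, profiles[y]))


# Hypothetical toy data: each instance is a set of attribute=value items.
train = [
    {"outlook=sunny", "wind=weak"},
    {"outlook=sunny", "wind=strong"},
    {"outlook=rain", "wind=strong"},
    {"outlook=rain", "wind=weak"},
]
labels = ["play", "play", "stay", "stay"]

kept, profiles = fit(train, labels, top_k=2)
print(predict({"outlook=sunny", "wind=strong"}, kept, profiles))
```

In this sketch, prediction costs one distance computation per class instead of one per training instance, which is the kind of saving the abstract refers to. In practice `top_k` would be chosen by validation accuracy, as the abstract describes, rather than fixed by hand.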