Current work on assembling a set of local patterns, such as rules and class association rules, into a global model for the prediction of a target usually focuses on identifying the minimal set of patterns that covers the training data. In this paper we present a different point of view: the model of a class is built to emphasize the typical features of the examples of that class. Typical features are modeled by frequent itemsets extracted from the examples and constitute a new representation space for the examples of the class. The target class of a test example is predicted by computing the distance between the vector that represents the example in the itemset space of each class and the vectors that represent the classes. Interestingly, in the distance computation the critical contribution to discrimination between classes comes not only from the itemsets of the class model that match the example but also from the itemsets that do not match it. These absent features carry information about the example that can be exploited for prediction and should not be disregarded. Moreover, absent features are more abundant in the wrong classes than in the correct one, and their number increases the distance between the example vector and the vectors of the negative classes. Furthermore, since absent features are frequent in their respective classes, they make the prediction more robust against over-fitting and noise. The use of features absent from the test example is a novel issue in classification: existing learners usually select the best local pattern that matches the example and do not consider the abundance of other patterns that fail to match it. We demonstrate the validity of our observations and the effectiveness of LODE, our learner, through extensive empirical experiments in which we compare the prediction accuracy of LODE with that of a comprehensive set of state-of-the-art classifiers. We also report the methodology we adopted to automatically determine the configuration of the learner and its parameters.
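As a rough illustration of this scheme, the sketch below mines per-class frequent itemsets with a naive Apriori-style enumeration and classifies a test example by the normalized count of model itemsets it fails to match, so that absent features push the wrong classes farther away, as the abstract describes. This is a minimal sketch under stated assumptions: the function names, the brute-force miner, and the miss-counting distance are hypothetical stand-ins, since the abstract does not specify LODE's actual mining procedure, pattern ranking, or metric.

```python
from itertools import combinations

def mine_frequent_itemsets(transactions, min_support, max_len=3):
    """Naive Apriori-style enumeration of frequent itemsets.

    transactions: list of sets of items; min_support: absolute count.
    A hypothetical stand-in for whatever miner LODE actually uses.
    """
    items = sorted({i for t in transactions for i in t})
    frequent = []
    for k in range(1, max_len + 1):
        for cand in combinations(items, k):
            cand = frozenset(cand)
            # Support = number of transactions containing the candidate.
            if sum(1 for t in transactions if cand <= t) >= min_support:
                frequent.append(cand)
    return frequent

def predict(example, class_models):
    """Assign the class whose itemset model is closest to the example.

    class_models: {label: itemsets mined from that class's examples}.
    The distance simply counts the model itemsets absent from the
    example (normalized); the paper's exact metric may differ.
    """
    best_label, best_dist = None, float("inf")
    for label, itemsets in class_models.items():
        misses = sum(1 for s in itemsets if not s <= example)
        dist = misses / len(itemsets) if itemsets else 1.0
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Toy usage on two classes with disjoint typical features.
train = {
    "pos": [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}],
    "neg": [{"x", "y"}, {"x", "y", "z"}, {"y", "z"}],
}
models = {c: mine_frequent_itemsets(ts, min_support=2)
          for c, ts in train.items()}
print(predict({"a", "b"}, models))  # -> "pos"
```

In this toy setup the example {a, b} misses every itemset of the "neg" model but only a few of the "pos" model, so the abundance of absent features in the wrong class is what drives the decision, mirroring the abstract's argument rather than any matched best pattern alone.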