A set of classification rules can be considered as a disjunction of rules, where each rule is a disjunct. A small disjunct is a rule covering a small number of examples. Small disjuncts are a serious problem for effective classification, because the small number of examples satisfying these rules makes their predictions unreliable and error-prone. This paper offers two main contributions to the research on small disjuncts. First, it investigates six candidate solutions (algorithms) for the problem of small disjuncts. Second, it reports the results of a meta-learning experiment, which produced meta-rules predicting which algorithm will tend to perform best for a given data set. The algorithms investigated in this paper belong to different machine learning paradigms and their hybrid combinations, as follows: two versions of a decision-tree (DT) induction algorithm; two versions of a hybrid DT/genetic algorithm (GA) method; one GA; and one hybrid DT/instance-based learning (IBL) algorithm. Experiments with 22 data sets evaluated both the predictive accuracy and the simplicity of the discovered rule sets, with the following conclusions. If one wants to maximize predictive accuracy only, then the hybrid DT/IBL seems to be the best choice. On the other hand, if one wants to maximize both predictive accuracy and rule set simplicity, which is important in the context of data mining, then a hybrid DT/GA seems to be the best choice.
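The definition of a small disjunct given above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the paper's method: the toy rules and examples, the helper names `rule_coverage` and `small_disjuncts`, the first-match rule semantics, and the size threshold `max_size` (the abstract does not say how few examples count as "small").

```python
from collections import Counter


def rule_coverage(rules, examples):
    """Count how many examples each rule covers.

    Assumption: rules form an ordered list and the first matching rule
    claims the example, so every example is counted at most once.
    """
    coverage = Counter({name: 0 for name, _ in rules})
    for example in examples:
        for name, predicate in rules:
            if predicate(example):
                coverage[name] += 1
                break
    return coverage


def small_disjuncts(coverage, max_size):
    """Return the names of rules covering at most `max_size` examples.

    `max_size` is a hypothetical threshold chosen by the caller.
    """
    return sorted(name for name, count in coverage.items() if count <= max_size)


# Toy rule set: each rule (disjunct) is a name plus a predicate over an example.
rules = [
    ("r1", lambda ex: ex["age"] < 30),
    ("r2", lambda ex: ex["age"] >= 30 and ex["income"] > 50),
    ("r3", lambda ex: True),  # default rule
]

# Toy data, invented for this sketch.
examples = [
    {"age": 25, "income": 20},
    {"age": 22, "income": 60},
    {"age": 28, "income": 30},
    {"age": 45, "income": 80},
    {"age": 50, "income": 10},
    {"age": 27, "income": 55},
]

coverage = rule_coverage(rules, examples)
small = small_disjuncts(coverage, max_size=1)
print(dict(coverage))
print(small)
```

In this toy setting, rules that cover only a single example are flagged as small disjuncts; with so little supporting evidence, an accuracy estimate for such a rule rests on very few cases, which is the reliability problem the abstract describes.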