This paper provides an analysis of the behavior of separate-and-conquer or covering rule learning algorithms by visualizing their evaluation metrics and their dynamics in coverage space, a variant of ROC space. Our results show that most commonly used metrics, including accuracy, weighted relative accuracy, entropy, and Gini index, are equivalent to one of two fundamental prototypes: precision, which tries to optimize the area under the ROC curve for unknown costs, and a cost-weighted difference between covered positive and negative examples, which tries to find the optimal point under known or assumed costs. We also show that a straightforward generalization of the m-estimate trades off these two prototypes. Furthermore, our results show that stopping and filtering criteria like CN2's significance test focus on identifying significant deviations from random classification, which does not necessarily avoid overfitting. We also identify a problem with Foil's MDL-based encoding length restriction, which proves to be largely equivalent to a variable threshold on the recall of the rule. In general, we interpret these results as evidence that, contrary to common conception, pre-pruning heuristics are not very well understood and deserve more investigation.
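The two prototypes and the m-estimate trade-off described above can be illustrated with a minimal sketch. It assumes a rule is summarized by the pair (p, n) of covered positive and negative examples out of P positives and N negatives in total. The function names, the parameterization of the generalized m-estimate as (p + m*c) / (p + n + m), and the example counts are assumptions made for illustration and are not taken from the paper.

```python
# Illustrative sketch: rule evaluation metrics as functions of (p, n),
# the positives and negatives a rule covers, with P and N the totals.
# All names, formulas as parameterized here, and counts are assumptions.

def precision(p, n):
    """Fraction of covered examples that are positive."""
    return p / (p + n) if p + n else 0.0

def cost_weighted_difference(p, n, c):
    """(1 - c) * p - c * n, where c in [0, 1] is the relative weight of a covered negative."""
    return (1 - c) * p - c * n

def accuracy(p, n, P, N):
    """Covered positives plus uncovered negatives, over all examples."""
    return (p + (N - n)) / (P + N)

def weighted_relative_accuracy(p, n, P, N):
    """Coverage times the gain in precision over the positive prior."""
    if p + n == 0:
        return 0.0
    coverage = (p + n) / (P + N)
    return coverage * (p / (p + n) - P / (P + N))

def generalized_m_estimate(p, n, m, c):
    """Assumed form (p + m*c) / (p + n + m); requires p + n + m > 0."""
    return (p + m * c) / (p + n + m)

def ranking(rules, heuristic):
    """Order candidate rules from best to worst under a heuristic."""
    return sorted(rules, key=lambda r: heuristic(*r), reverse=True)

# Hypothetical data set and candidate rules, given as (p, n) pairs.
P, N = 50, 100
RULES = [(10, 0), (30, 10), (40, 40), (50, 100), (5, 20)]
PRIOR = P / (P + N)

by_accuracy = ranking(RULES, lambda p, n: accuracy(p, n, P, N))
by_equal_costs = ranking(RULES, lambda p, n: cost_weighted_difference(p, n, 0.5))
by_wra = ranking(RULES, lambda p, n: weighted_relative_accuracy(p, n, P, N))
by_prior_costs = ranking(RULES, lambda p, n: cost_weighted_difference(p, n, PRIOR))
by_precision = ranking(RULES, precision)
by_m_zero = ranking(RULES, lambda p, n: generalized_m_estimate(p, n, 0, PRIOR))
by_m_large = ranking(RULES, lambda p, n: generalized_m_estimate(p, n, 1e6, PRIOR))

if __name__ == "__main__":
    print("accuracy        :", by_accuracy)
    print("p - n           :", by_equal_costs)
    print("WRA             :", by_wra)
    print("p/P - n/N costs :", by_prior_costs)
    print("precision       :", by_precision)
    print("m-estimate, m=0 :", by_m_zero)
    print("m-estimate, m=1e6:", by_m_large)
```

In this sketch, accuracy orders the rules like the difference with equal costs, and weighted relative accuracy orders them like the difference with costs set by the class prior. Setting m = 0 reduces the assumed m-estimate to precision, while a very large m makes its ordering follow the cost-weighted difference for the chosen c.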