Why is rule learning optimistic and how to correct it

  • Authors:
  • Martin Možina, Janez Demšar, Jure Žabkar, Ivan Bratko

  • Affiliations:
  • Faculty of Computer and Information Science, University of Ljubljana, Ljubljana (all authors)

  • Venue:
  • ECML'06: Proceedings of the 17th European Conference on Machine Learning
  • Year:
  • 2006

Abstract

In their search through a huge space of possible hypotheses, rule induction algorithms compare quality estimates of a large number of candidate rules to find the one that appears best. This mechanism can easily pick up random patterns in the data, which receive optimistically high quality estimates even when the estimation method itself (such as relative frequency) is unbiased. It is generally believed that this problem, which eventually leads to overfitting, can be alleviated by using the m-estimate of probability. We show that this only partially mends the problem, and we propose a novel way of making the common rule evaluation functions account for the multiple comparisons performed during the search. Experiments on artificial data sets and on data sets from the UCI repository show a large improvement in the accuracy of probability predictions and a decent gain in the AUC of the constructed models.
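
To make the source of the optimism concrete, here is a minimal simulation sketch (not from the paper; all parameter values are illustrative). Many candidate rules each cover examples drawn from the same 50/50 class distribution, so every individual relative-frequency estimate is unbiased, yet the estimate of the best-looking rule lands far above the true probability. The m-estimate, computed as (s + m·p0)/(n + m) with prior p0, shrinks each score toward the prior but does not remove the selection bias:

```python
import numpy as np

rng = np.random.default_rng(0)

true_p = 0.5      # every candidate rule covers examples from the same 50/50 distribution
n_covered = 20    # examples covered by each rule (illustrative)
n_rules = 1000    # candidate rules compared during the search (illustrative)
m = 2             # m parameter of the m-estimate (illustrative)

# Positives covered by each candidate rule: binomial draws, so each
# relative-frequency estimate is individually unbiased.
positives = rng.binomial(n_covered, true_p, size=n_rules)

rel_freq = positives / n_covered                     # relative frequency per rule
m_est = (positives + m * true_p) / (n_covered + m)   # m-estimate with prior p0 = 0.5

print(f"true class probability:        {true_p:.3f}")
print(f"mean relative frequency:       {rel_freq.mean():.3f}")  # close to 0.5: unbiased
print(f"best rule, relative frequency: {rel_freq.max():.3f}")   # far above 0.5: optimistic
print(f"best rule, m-estimate:         {m_est.max():.3f}")      # shrunk, but still above 0.5
```

The maximum over many unbiased estimates is itself biased upward, which is why shrinking each estimate individually (as the m-estimate does) can only partially correct the optimism; closing that remaining gap is what the paper's correction for multiple comparisons addresses.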