Why is rule learning optimistic and how to correct it

  • Authors:
  • Martin Možina, Janez Demšar, Jure Žabkar, Ivan Bratko

  • Affiliations:
  • Faculty of Computer and Information Science, University of Ljubljana, Ljubljana (all authors)

  • Venue:
  • ECML'06: Proceedings of the 17th European Conference on Machine Learning
  • Year:
  • 2006

Abstract

In their search through a huge space of possible hypotheses, rule induction algorithms compare quality estimates of a large number of candidate rules to find the one that appears best. This mechanism can easily pick up random patterns in the data, which receive optimistically high quality estimates even when the estimation method itself (such as relative frequency) is unbiased. It is generally believed that this problem, which eventually leads to overfitting, can be alleviated by using the m-estimate of probability. We show that this only partially mends the problem, and we propose a novel way of making the common rule evaluation functions account for the multiple comparisons performed during the search. Experiments on artificial data sets and on data sets from the UCI repository show a large improvement in the accuracy of probability predictions and a decent gain in the AUC of the constructed models.
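
To make the source of the optimism concrete, here is a minimal simulation sketch (not from the paper; all parameter values are illustrative). Many candidate rules each cover examples drawn from the same 50/50 class distribution, so every individual relative-frequency estimate is unbiased, yet the estimate of the best-looking rule lands far above the true probability. The m-estimate, computed as (s + m·p0)/(n + m) with prior p0, shrinks each score toward the prior but does not remove the selection bias:

```python
import numpy as np

rng = np.random.default_rng(0)

true_p = 0.5      # every candidate rule covers examples from the same 50/50 distribution
n_covered = 20    # examples covered by each rule (illustrative)
n_rules = 1000    # candidate rules compared during the search (illustrative)
m = 2             # m parameter of the m-estimate (illustrative)

# Positives covered by each candidate rule: binomial draws, so each
# relative-frequency estimate is individually unbiased.
positives = rng.binomial(n_covered, true_p, size=n_rules)

rel_freq = positives / n_covered                     # relative frequency per rule
m_est = (positives + m * true_p) / (n_covered + m)   # m-estimate with prior p0 = 0.5

print(f"true class probability:        {true_p:.3f}")
print(f"mean relative frequency:       {rel_freq.mean():.3f}")  # close to 0.5: unbiased
print(f"best rule, relative frequency: {rel_freq.max():.3f}")   # far above 0.5: optimistic
print(f"best rule, m-estimate:         {m_est.max():.3f}")      # shrunk, but still above 0.5
```

The maximum over many unbiased estimates is itself biased upward, which is why shrinking each estimate individually (as the m-estimate does) can only partially correct the optimism; closing that remaining gap is what the paper's correction for multiple comparisons addresses.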