In their search through a huge space of possible hypotheses, rule induction algorithms compare the quality estimates of a large number of rules to find the one that appears best. This mechanism can easily pick up random patterns in the data, which will have optimistically high quality estimates even when the estimation method itself is unbiased (such as relative frequency). It is generally believed that this problem, which eventually leads to overfitting, can be alleviated by using the m-estimate of probability. We show that this only partially mends the problem, and we propose a novel way of making the common rule evaluation functions account for the multiple comparisons performed during the search. Experiments on artificial data sets and on data sets from the UCI repository show a large improvement in the accuracy of probability predictions, as well as a decent gain in the AUC of the constructed models.
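The effect described above can be illustrated with a small simulation. The sketch below is not the paper's method; it only demonstrates the premise: even when each rule's relative frequency is an unbiased estimate, taking the maximum over many candidate rules yields an optimistic estimate, and the m-estimate (here with assumed illustrative parameters m = 2 and a uniform prior of 0.5) shrinks it toward the prior but does not fully remove the bias.

```python
import random

def relative_frequency(pos, n):
    # Unbiased estimate of a single, fixed rule's accuracy.
    return pos / n

def m_estimate(pos, n, m=2.0, prior=0.5):
    # m-estimate: shrinks the relative frequency toward the class prior.
    # m and prior are illustrative choices, not values from the paper.
    return (pos + m * prior) / (n + m)

random.seed(0)
n_rules, covered, true_p = 1000, 20, 0.5

# Simulate 1000 random rules, each covering 20 examples whose true
# probability of being positive is exactly 0.5 (pure noise).
positives = [sum(random.random() < true_p for _ in range(covered))
             for _ in range(n_rules)]

# The search keeps the rule that *appears* best.
best = max(positives)
print("relative frequency of best rule:", relative_frequency(best, covered))
print("m-estimate of best rule:        ", m_estimate(best, covered))
```

Although every rule here is pure noise, the best-looking rule's relative frequency lands far above the true probability of 0.5; the m-estimate is lower but still optimistic, matching the abstract's claim that it only partially mends the problem.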