This paper addresses polynomial regression, i.e., the task of inducing polynomials from numeric data that can be used to predict the value of a selected numeric variable. As in other learning tasks, the central problem is finding an optimal trade-off between the complexity of the induced model and its predictive error. One approach to finding this trade-off is the minimum description length (MDL) principle. We propose an MDL scheme for polynomial regression that includes coding schemes for polynomials and for the errors they make on the data. We empirically compare this principled MDL scheme to an ad hoc MDL scheme and show that it performs better. With these improvements, the proposed polynomial regression approach becomes comparable in performance to other commonly used regression methods, such as model trees.
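To make the complexity versus error trade-off concrete, the following is a minimal sketch of MDL-style model selection over polynomial degree. It does not reproduce the coding schemes proposed in the paper, which the abstract does not specify. It assumes a generic two-part code instead: roughly (k/2) log2 n bits for the k coefficients, plus (n/2) log2(RSS/n) bits for the residuals under a Gaussian error model, with constant terms dropped. The function names fit_polynomial, description_length and select_degree, as well as the synthetic data, are hypothetical and chosen only for illustration.

```python
import math
import random


def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients c[0..degree] of sum_j c[j] * x**j."""
    m = degree + 1
    # Augmented normal-equation matrix [A^T A | A^T y].
    a = [[sum(x ** (i + j) for x in xs) for j in range(m)]
         + [sum(y * x ** i for x, y in zip(xs, ys))]
         for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, m):
            factor = a[r][col] / a[col][col]
            for c in range(col, m + 1):
                a[r][c] -= factor * a[col][c]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        rest = sum(a[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (a[i][m] - rest) / a[i][i]
    return coeffs


def description_length(xs, ys, degree):
    """Assumed two-part code length in bits, up to an additive constant."""
    n = len(xs)
    coeffs = fit_polynomial(xs, ys, degree)
    rss = sum((y - sum(c * x ** j for j, c in enumerate(coeffs))) ** 2
              for x, y in zip(xs, ys))
    rss = max(rss, 1e-12)  # guard against log of zero on a perfect fit
    model_bits = 0.5 * (degree + 1) * math.log2(n)  # cost of the polynomial
    error_bits = 0.5 * n * math.log2(rss / n)       # cost of its errors
    return model_bits + error_bits


def select_degree(xs, ys, max_degree=5):
    """Return the degree with the shortest description and all scores."""
    scores = {d: description_length(xs, ys, d) for d in range(max_degree + 1)}
    return min(scores, key=scores.get), scores


# Synthetic data (an assumption for illustration): a noisy quadratic.
rng = random.Random(0)
n_points = 200
xs = [-1.0 + 2.0 * i / (n_points - 1) for i in range(n_points)]
ys = [1.0 - 2.0 * x + 3.0 * x ** 2 + rng.gauss(0.0, 0.1) for x in xs]

best_degree, scores = select_degree(xs, ys)
for d in sorted(scores):
    print(f"degree {d}: {scores[d]:.1f} bits")
print("selected degree:", best_degree)
```

Because the error term is a relative code length with constants dropped, the scores can be negative; only the differences between degrees matter. Underfitting degrees pay heavily in error bits, while each extra coefficient costs about half of log2 n bits, so a higher degree is chosen only if it shortens the encoding of the residuals by more than that.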