A minimal description length scheme for polynomial regression

Authors:
Aleksandar Pečkov;Sašo Džeroski;Ljupčo Todorovski
Affiliations:
Jozef Stefan Institute, Ljubljana, Slovenia;Jozef Stefan Institute, Ljubljana, Slovenia;Jozef Stefan Institute, Ljubljana, Slovenia
Venue:
PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining
Year:
2008

Abstract

The paper addresses the task of polynomial regression, i.e., the task of inducing polynomials from numeric data that can be used to predict the value of a selected numeric variable. As in other learning tasks, we face the problem of finding an optimal trade-off between the complexity of the induced model and its predictive error. One of the approaches to finding this optimal trade-off is the minimal description length (MDL) principle. In this paper, we propose an MDL scheme for polynomial regression, which includes coding schemes for polynomials and the errors they make on data. We empirically compare this principled MDL scheme to an ad-hoc MDL scheme and show that it performs better. The improvements in performance are such that the polynomial regression approach we propose is now comparable in performance to other commonly used methods for regression, such as model trees.
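
The trade-off between model complexity and predictive error described in the abstract can be illustrated with a rough sketch. The code below is not the coding scheme proposed in the paper; it assumes a generic BIC-style two-part approximation (about `(k/2) * log2(n)` bits for `k` coefficients plus `(n/2) * log2(RSS/n)` bits for the errors) applied to a univariate polynomial. The function names `mdl_score` and `select_degree`, and the synthetic data, are hypothetical and chosen only for illustration.

```python
import numpy as np


def mdl_score(x, y, degree):
    """Approximate two-part description length (in bits) of a polynomial fit.

    Assumption (not taken from the paper): the model costs
    (k/2) * log2(n) bits for k coefficients, and the errors cost
    (n/2) * log2(RSS / n) bits, a BIC-style approximation.
    """
    n = len(x)
    coeffs = np.polyfit(x, y, degree)          # least-squares fit
    residuals = y - np.polyval(coeffs, x)
    rss = max(float(np.sum(residuals ** 2)), 1e-12)  # guard against log(0)
    k = degree + 1                              # number of coefficients
    model_bits = 0.5 * k * np.log2(n)
    error_bits = 0.5 * n * np.log2(rss / n)
    return model_bits + error_bits


def select_degree(x, y, max_degree=6):
    """Return the degree with the smallest score, plus all scores."""
    scores = {d: mdl_score(x, y, d) for d in range(max_degree + 1)}
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic data from a quadratic with additive Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
y = 1.0 - 2.0 * x + 3.0 * x ** 2 + rng.normal(scale=0.1, size=x.size)

best_degree, scores = select_degree(x, y)
for d in sorted(scores):
    print(f"degree {d}: {scores[d]:.1f} bits")
print("selected degree:", best_degree)
```

On this synthetic data, degrees 0 and 1 score far worse than degree 2 because their error term dominates, while each degree above 2 pays an extra complexity penalty for little reduction in error. A principled scheme such as the one the paper proposes would replace both cost terms with explicit codes for the polynomial and its errors.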