Residual variance estimation in machine learning

  • Authors:
  • Elia Liitiäinen, Michel Verleysen, Francesco Corona, Amaury Lendasse

  • Affiliations:
  • Department of Information and Computer Science, Helsinki University of Technology, P.O. Box 5400, Espoo, Finland (Liitiäinen, Corona, Lendasse); Machine Learning Group, Université Catholique de Louvain, 3 Place du Levant, B-1348 Louvain-la-Neuve, Belgium (Verleysen)

  • Venue:
  • Neurocomputing
  • Year:
  • 2009

Abstract

The problem of residual variance estimation consists of estimating the best possible generalization error obtainable by any model based on a finite sample of data. Even though it is a natural generalization of linear correlation, residual variance estimation in its general form has attracted relatively little attention in machine learning. In this paper, we examine four different residual variance estimators and analyze their properties both theoretically and experimentally to better understand their applicability in machine learning problems. The theoretical treatment differs from previous work in that it is based on a general formulation of the problem that also covers heteroscedastic noise, whereas previous work concentrates on homoscedastic and additive noise. In the second part of the paper, we demonstrate practical applications in input and model structure selection. The experimental results show that using residual variance estimators in these tasks gives good results, often with reduced computational complexity, and that the nearest neighbor estimators are simple and easy to implement.
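
The abstract does not name the four estimators it compares, but the nearest neighbor family it highlights includes the classic first-nearest-neighbor (delta test) estimator. As an illustration only, the sketch below implements that estimator in Python with NumPy and SciPy; the function name delta_test and the library choices are assumptions rather than the authors' code, and the sketch assumes homoscedastic additive noise, a simplification of the paper's more general formulation.

    import numpy as np
    from scipy.spatial import cKDTree

    def delta_test(X, y):
        """First-nearest-neighbor (delta test) residual variance estimate.

        Estimates Var(eps) in y = f(X) + eps directly from the sample,
        without fitting any model f. Assumes homoscedastic noise and a
        reasonably smooth f (a simplification of the paper's setting).
        """
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float).ravel()
        tree = cKDTree(X)
        # k=2: the nearest neighbor of each point, excluding the point itself.
        _, idx = tree.query(X, k=2)
        nn = idx[:, 1]
        # For smooth f, y[nn] - y approximates eps[nn] - eps, whose variance
        # is 2 * Var(eps); halving the mean square recovers Var(eps).
        return 0.5 * np.mean((y[nn] - y) ** 2)

For example, with 1000 points drawn uniformly on the unit square and y = sin(4 * x1) plus Gaussian noise of standard deviation 0.1, the estimate comes out close to the true residual variance of 0.01. Such a model-free estimate can serve directly as the baseline for the input and model structure selection tasks the abstract describes.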