A machine learning model is said to overfit the training data, relative to a simpler model, if it is more accurate on the training data but less accurate on the test data. Overfitting control--selecting a fit of appropriate complexity--is a central problem in machine learning. Previous overfitting control methods include penalty methods, which penalize a model for its complexity; cross-validation methods, which experimentally determine when overfitting occurs by estimating test accuracy on held-out training data; and ensemble methods, which reduce the risk of overfitting by combining multiple models. These methods are all eager in that they attempt to control overfitting at training time, and they all attempt to improve the average accuracy, as computed over the test data. This paper presents an overfitting control method that is lazy--it attempts to control overfitting at prediction time, separately for each test case. Our results suggest that lazy methods perform well because they exploit the particulars of each test case at prediction time rather than averaging over all possible test cases at training time.
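To make the eager/lazy distinction concrete, the following is a minimal sketch of a lazy overfitting control, not the paper's actual algorithm: a k-nearest-neighbor regressor that chooses its complexity parameter k separately for each query point, using a local leave-one-out criterion over the query's own neighborhood. The function name, candidate grid, and selection criterion are all illustrative assumptions. An eager analogue would instead pick a single k by cross-validation once at training time and reuse it for every query.

```python
# Hedged sketch: per-query (lazy) complexity selection for a k-NN regressor.
# Illustrates the eager-vs-lazy distinction only; names and the local
# leave-one-out criterion are assumptions, not the paper's method.

import numpy as np

def lazy_knn_predict(X_train, y_train, x_query, k_candidates=(1, 3, 5, 9, 15)):
    """Predict y at x_query, choosing k per query by local leave-one-out error."""
    dists = np.linalg.norm(X_train - x_query, axis=1)
    order = np.argsort(dists)            # training points, nearest to the query first
    best_k, best_err = min(k_candidates), np.inf   # fallback for tiny training sets
    for k in k_candidates:
        if k >= len(order):
            break
        # Score k on the query's own neighborhood: for each of the k nearest
        # training points, predict it from its k nearest *other* neighbors.
        errs = []
        for i in order[:k]:
            d_i = np.linalg.norm(X_train - X_train[i], axis=1)
            d_i[i] = np.inf              # leave the point itself out
            nn = np.argsort(d_i)[:k]
            errs.append((y_train[nn].mean() - y_train[i]) ** 2)
        err = np.mean(errs)
        if err < best_err:
            best_k, best_err = k, err
    return y_train[order[:best_k]].mean()  # prediction with the locally chosen k

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(200, 1))
    y = np.sin(X[:, 0]) + rng.normal(0, 0.1, size=200)
    print(lazy_knn_predict(X, y, np.array([0.5])))
```

The point of the sketch is that the chosen k can differ from query to query: in a densely sampled, low-noise region the local criterion may favor a small k (a complex, flexible fit), while in a sparse or noisy region it may favor a large k (a simpler, smoother fit).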