A common assumption in supervised learning is that the training and test input points follow the same probability distribution. However, this assumption is not fulfilled in, e.g., interpolation, extrapolation, or active learning scenarios. The violation of this assumption, known as covariate shift, causes a heavy bias in standard generalization error estimation schemes such as cross-validation, and they therefore result in poor model selection. In this paper, we propose an alternative estimator of the generalization error. Under covariate shift, the proposed generalization error estimator is unbiased if the learning target function is included in the model at hand, and it is asymptotically unbiased in general. Experimental results show that model selection with the proposed generalization error estimator compares favorably to cross-validation in extrapolation.
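The bias that covariate shift induces in naive error estimates, and the standard importance-weighting remedy that underlies estimators of this kind, can be illustrated with a small sketch. This is not the paper's proposed estimator; it is a toy example (Gaussian training/test densities assumed known, a deliberately misspecified constant model) showing how weighting training losses by the density ratio p_test(x)/p_train(x) corrects the naive training-based estimate:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy covariate shift: training inputs ~ N(0, 1), test inputs ~ N(1, 1)
# (an extrapolation-like setting). True target f(x) = x, noiseless;
# we fit a constant model, which is misspecified on purpose.
n = 2000
x_tr = rng.normal(0.0, 1.0, n)
x_te = rng.normal(1.0, 1.0, n)
y_tr = x_tr

theta = y_tr.mean()  # least-squares fit of the constant model

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Importance weights w(x) = p_test(x) / p_train(x); both densities are
# assumed known here, which is rarely the case in practice.
w = gauss_pdf(x_tr, 1.0, 1.0) / gauss_pdf(x_tr, 0.0, 1.0)

naive_err = np.mean((y_tr - theta) ** 2)   # plain training error: ignores the shift
iw_err = np.mean(w * (y_tr - theta) ** 2)  # importance-weighted estimate
true_err = np.mean((x_te - theta) ** 2)    # Monte Carlo approximation of test error

print(f"naive estimate:               {naive_err:.2f}")
print(f"importance-weighted estimate: {iw_err:.2f}")
print(f"actual test error:            {true_err:.2f}")
```

With these densities the naive estimate converges to 1 while the true test error is about 2; the importance-weighted estimate tracks the latter, which is the kind of bias correction that makes reliable model selection possible under covariate shift.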