Asymptotic Bayesian generalization error when training and test distributions are different

Authors:
Keisuke Yamazaki;Motoaki Kawanabe;Sumio Watanabe;Masashi Sugiyama;Klaus-Robert Müller
Affiliations:
Tokyo Institute of Technology, Midori-ku, Yokohama, Japan;Fraunhofer FIRST, IDA, Berlin, Germany;Tokyo Institute of Technology;Tokyo Institute of Technology, Meguro-ku, Tokyo, Japan;Technical University of Berlin, Berlin, Germany
Venue:
Proceedings of the 24th international conference on Machine learning
Year:
2007

Abstract

Supervised learning commonly assumes that training and test data are sampled from the same distribution. In practice this assumption can be violated, and standard machine learning techniques then perform poorly. This paper characterizes and improves the performance of Bayesian estimation when the training and test distributions differ. We formally analyze the asymptotic Bayesian generalization error and establish an upper bound on it under a very general setting. Our key finding is that lower-order terms, which can be ignored when the distribution does not change, play an important role when it does. We also propose a novel variant of stochastic complexity that can be used to choose an appropriate model and hyperparameters under a particular distribution change.
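
As a rough illustration of the setting described in the abstract, the sketch below estimates a Bayesian generalization error numerically when the test inputs come from a different distribution than the training inputs. It is not the paper's analysis. It assumes a toy setup chosen here for convenience: conjugate Bayesian linear regression with known noise variance, a sine-shaped true function (so the linear model is misspecified), a change in the input distribution only, and the generalization error taken as the Kullback-Leibler divergence from the true conditional distribution to the Bayesian predictive distribution, averaged over test inputs and training sets. All function names and parameter values are hypothetical.

```python
import numpy as np


def features(x):
    """Design matrix for a straight-line model: [1, x]."""
    return np.column_stack([np.ones_like(x), x])


def posterior(Phi, y, noise_var=0.25, prior_prec=1.0):
    """Gaussian posterior over weights (prior N(0, I / prior_prec), known noise)."""
    precision = prior_prec * np.eye(Phi.shape[1]) + Phi.T @ Phi / noise_var
    cov = np.linalg.inv(precision)
    mean = cov @ Phi.T @ y / noise_var
    return mean, cov


def kl_generalization_error(x_test, mean, cov, true_fn, noise_var=0.25):
    """Average KL(true conditional || Bayesian predictive) over test inputs."""
    Phi = features(x_test)
    pred_mean = Phi @ mean
    # Predictive variance = noise variance + parameter uncertainty at each x.
    pred_var = noise_var + np.einsum("ij,jk,ik->i", Phi, cov, Phi)
    sq_bias = (true_fn(x_test) - pred_mean) ** 2
    kl = (0.5 * np.log(pred_var / noise_var)
          + (noise_var + sq_bias) / (2.0 * pred_var) - 0.5)
    return kl.mean()


def experiment(n_train=50, n_trials=200, shift=2.0, seed=0):
    """Compare the error with and without a shift in the test input mean."""
    rng = np.random.default_rng(seed)
    true_fn = np.sin  # assumed true regression function (model is misspecified)
    err_same, err_shifted = [], []
    for _ in range(n_trials):
        x = rng.normal(0.0, 1.0, n_train)               # training inputs
        y = true_fn(x) + rng.normal(0.0, 0.5, n_train)  # noise std 0.5 (var 0.25)
        mean, cov = posterior(features(x), y)
        x_same = rng.normal(0.0, 1.0, 2000)      # test inputs, no change
        x_shifted = rng.normal(shift, 1.0, 2000)  # test inputs, shifted mean
        err_same.append(kl_generalization_error(x_same, mean, cov, true_fn))
        err_shifted.append(kl_generalization_error(x_shifted, mean, cov, true_fn))
    return float(np.mean(err_same)), float(np.mean(err_shifted))


if __name__ == "__main__":
    same, shifted = experiment()
    print(f"error, same input distribution:    {same:.4f}")
    print(f"error, shifted input distribution: {shifted:.4f}")
```

In this toy setup the error under the shifted test distribution comes out larger than under the training distribution, because the linear fit is tuned to the region where training inputs are dense. The sketch only demonstrates the phenomenon empirically; it says nothing about the asymptotic upper bound or the stochastic complexity variant proposed in the paper.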