Finding robust models using a stratified design

  • Authors: Rohan A. Baxter

  • Affiliations: Analytics Project, Office of the Chief Knowledge Officer, Australian Taxation Office, ACT

  • Venue: AI'06 Proceedings of the 19th Australian joint conference on Artificial Intelligence: advances in Artificial Intelligence
  • Year: 2006

Abstract

Predictive performance in model selection is often estimated using out-of-sample validation and test datasets. The assumption is that the test and validation datasets are drawn from the same population as the training dataset. This assumption may not hold in the common application context where the model is used to score future data. This paper proposes a sample design that can lead to better model performance and robust estimates of model generalization error. The sample design is demonstrated on a collection scoring application.