Robust model-based sampling designs

  • Authors: A. H. Welsh; Douglas P. Wiens

  • Affiliations: Centre for Mathematics and its Applications, The Australian National University, Canberra, Australia ACT 0200; Department of Mathematical and Statistical Sciences, University of Alberta, Edmonton, Canada T6G 2G1

  • Venue: Statistics and Computing
  • Year: 2013

Abstract

We investigate methods for the design of sample surveys, and address the traditional resistance of survey samplers to the use of model-based methods by incorporating model robustness at the design stage. The designs are intended to be sufficiently flexible and robust that the resulting estimates, based on the designer's best guess at an appropriate model, remain reasonably accurate in a neighbourhood of this central model. Consider a finite population of N units in which a survey variable Y is related to a q-dimensional auxiliary variable x. We assume that the values of x are known for all N population units, and that we will select a sample of n≤N population units and then observe the n corresponding values of Y. The objective is to predict the population total $T=\sum_{i=1}^{N}Y_{i}$. The design problem which we consider is to specify a selection rule, using only the values of the auxiliary variable, to select the n units for the sample so that the predictor has optimal robustness properties. We suppose that T will be predicted by methods based on a linear relationship between Y (possibly transformed) and given functions of x. We maximise the mean squared error of the prediction of T over realistic neighbourhoods of the fitted linear relationship, and of the assumed variance and correlation structures. This maximised mean squared error is then minimised over the class of possible samples, yielding an optimally robust ('minimax') design. To carry out the minimisation step we introduce a genetic algorithm and discuss its tuning for maximal efficiency.
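
The maximise-then-minimise structure described in the abstract can be illustrated with a small Python sketch. This is not the authors' algorithm or loss function; it is a toy version under several simplifying assumptions that are not taken from the paper: a scalar auxiliary variable, a straight-line working model fitted by ordinary least squares, independent homoscedastic errors (so the variance and correlation neighbourhoods are ignored), and a small finite list of hypothetical contaminating functions standing in for the neighbourhood of the fitted relationship. The names `prediction_mse` and `genetic_minimax_design`, and all tuning values, are illustrative only.

```python
import numpy as np

def prediction_mse(sample, x, contaminants, sigma2=1.0):
    """Worst-case MSE of the regression predictor of T for one sample.

    Assumed truth: Y_i = b0 + b1*x_i + f(x_i) + e_i, with e_i independent
    and Var(e_i) = sigma2. The maximum is taken over the central model
    (f = 0) and the finite list `contaminants` (an assumption of this sketch).
    """
    N = len(x)
    mask = np.zeros(N, dtype=bool)
    mask[sample] = True
    Z = np.column_stack([np.ones(N), x])      # regressors (1, x)
    Zs, Zr = Z[mask], Z[~mask]                # sampled / unsampled rows
    G = np.linalg.solve(Zs.T @ Zs, Zs.T)      # OLS operator, shape (2, n)
    a = Zr.sum(axis=0) @ G                    # weights on the sampled errors
    # Prediction error = a'(f_s + e_s) - sum over unsampled of (f_i + e_i)
    variance = sigma2 * (a @ a + (~mask).sum())
    worst = variance                          # central model: zero bias
    for f in contaminants:
        fv = f(x)
        bias = a @ fv[mask] - fv[~mask].sum()
        worst = max(worst, variance + bias ** 2)
    return worst

def genetic_minimax_design(x, n, contaminants, pop_size=30,
                           generations=60, mutation_rate=0.2, seed=0):
    """Minimise the worst-case MSE over size-n samples with a simple GA."""
    rng = np.random.default_rng(seed)
    N = len(x)
    loss = lambda s: prediction_mse(s, x, contaminants)
    pop = [rng.choice(N, size=n, replace=False) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=loss)
        survivors = pop[: pop_size // 2]      # elitism: keep the better half
        children = []
        while len(survivors) + len(children) < pop_size:
            i, j = rng.choice(len(survivors), size=2, replace=False)
            # Crossover: draw n units from the union of the two parents
            pool = np.union1d(survivors[i], survivors[j])
            child = rng.choice(pool, size=n, replace=False)
            # Mutation: swap one sampled unit for an unsampled one
            if n < N and rng.random() < mutation_rate:
                outside = np.setdiff1d(np.arange(N), child)
                child[rng.integers(n)] = rng.choice(outside)
            children.append(child)
        pop = survivors + children
    best = min(pop, key=loss)
    return np.sort(best), loss(best)

# Toy population: 40 units, sample 8 of them (hypothetical values).
x = np.random.default_rng(1).uniform(0.0, 1.0, size=40)
contaminants = [
    lambda u: 0.5 * (u ** 2 - np.mean(u ** 2)),   # mild curvature
    lambda u: 0.3 * np.sin(2 * np.pi * u),        # smooth oscillation
]
design, value = genetic_minimax_design(x, 8, contaminants)
print(design, value)
```

Because the better half of each generation is carried over unchanged, the best worst-case MSE found never increases from one generation to the next. The population size, number of generations, and mutation rate are the kind of tuning parameters the abstract alludes to, but the values here are arbitrary.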