Comparison of inference methods for estimating semivariogram model parameters and their uncertainty: The case of small data sets

  • Authors:
  • Eulogio Pardo-Igúzquiza; Peter A. Dowd

  • Affiliations:
  • Instituto Geológico y Minero de España (IGME), Geological Survey of Spain, Ríos Rosas 23, 28003 Madrid, Spain; Faculty of Engineering, Computer and Mathematical Sciences, University of Adelaide, Adelaide, SA 5005, Australia

  • Venue:
  • Computers & Geosciences
  • Year:
  • 2013

Abstract

The semivariogram model is the fundamental component in all geostatistical applications and its inference is an issue of significant practical interest. The semivariogram model is defined by a mathematical function, the parameters of which are usually estimated from the experimental data. There are important application areas in which small data sets are the norm; rainfall estimation from rain gauge data and transmissivity estimation from pumping test data are two examples from, respectively, surface and subsurface hydrology. Thus, a benchmark problem in geostatistics is deciding on the most appropriate method for the inference of the semivariogram model. The various methods for semivariogram inference can be classified as indirect methods, in which there is an intermediate step of calculating the experimental semivariogram, and direct approaches that obtain the model parameter values directly as the values that minimize some objective function. To avoid subjectivity in fitting models to experimental semivariograms, ordinary least squares (OLS), weighted least squares (WLS) and generalized least squares (GLS) are often used. Uncertainty evaluation in indirect methods is done using computationally intensive resampling procedures such as the bootstrap method. Direct methods include parametric methods, such as maximum likelihood (ML) and maximum likelihood cross-validation (MLCV), and non-parametric methods, such as minimization of cross-validation statistics (CV). The bases for comparing the previous methods are the sampling distribution of the various parameters and the "goodness" of the uncertainty evaluation in a sense that we define. The final questions to be answered are: (1) Which is the best method for estimating each of the semivariogram parameters? (2) Which is the best method for assessing the uncertainty of each of the parameters? (3) Which method best selects the functional form of the semivariogram from among a set of options? (4) Which is the best method that jointly addresses all the previous questions?
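
As an illustration of the indirect route described in the abstract (experimental semivariogram first, then a least-squares fit), the following is a minimal sketch and not the authors' implementation. It assumes a spherical model, the classical (Matheron) estimator, a grid of candidate ranges, and, for the WLS variant, weights equal to the number of pairs per lag; all function names and the synthetic data are hypothetical. For a fixed range the model is linear in the nugget and partial sill, so each candidate range reduces to a linear least-squares problem.

```python
import numpy as np

def experimental_semivariogram(coords, values, lag_edges):
    """Classical estimator: half the mean squared difference per lag bin."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)          # all distinct pairs
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    lags, gammas, counts = [], [], []
    for lo, hi in zip(lag_edges[:-1], lag_edges[1:]):
        mask = (dist > lo) & (dist <= hi)
        if mask.any():                                # skip empty bins
            lags.append(dist[mask].mean())
            gammas.append(0.5 * sqdiff[mask].mean())
            counts.append(mask.sum())
    return np.array(lags), np.array(gammas), np.array(counts)

def spherical(h, nugget, psill, a):
    """Spherical semivariogram with nugget, partial sill and range a."""
    h = np.asarray(h, dtype=float)
    r = np.minimum(h / a, 1.0)
    return np.where(h > 0, nugget + psill * (1.5 * r - 0.5 * r ** 3), 0.0)

def fit_spherical_ls(lags, gammas, weights, candidate_ranges):
    """Profile least squares over a grid of ranges (assumed search strategy).

    weights = ones gives OLS; weights = pair counts gives a simple WLS.
    Non-negativity of nugget and sill is NOT enforced in this sketch.
    """
    sw = np.sqrt(np.asarray(weights, dtype=float))
    best = None
    for a in candidate_ranges:
        r = np.minimum(lags / a, 1.0)
        X = np.column_stack([np.ones_like(lags), 1.5 * r - 0.5 * r ** 3])
        coef, *_ = np.linalg.lstsq(X * sw[:, None], gammas * sw, rcond=None)
        wsse = float(np.sum((sw * (gammas - X @ coef)) ** 2))
        if best is None or wsse < best["wsse"]:
            best = {"nugget": coef[0], "partial_sill": coef[1],
                    "range": float(a), "wsse": wsse}
    return best

# Synthetic small data set (30 points), purely for demonstration.
gen = np.random.default_rng(0)
coords = gen.uniform(0.0, 100.0, size=(30, 2))
values = gen.normal(size=30)
lags, gammas, counts = experimental_semivariogram(
    coords, values, np.linspace(0.0, 60.0, 7))
grid = np.linspace(5.0, 60.0, 56)
ols_fit = fit_spherical_ls(lags, gammas, np.ones_like(gammas), grid)
wls_fit = fit_spherical_ls(lags, gammas, counts, grid)
```

With only 30 points the OLS and WLS estimates can differ noticeably and can vary strongly between samples, which is the small-sample behaviour the abstract sets out to compare. Assessing that variability would require a resampling step such as the bootstrap, which this sketch omits.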