Game-theoretic probability combination with applications to resolving conflicts between statistical methods

  • Authors:
  • David R. Bickel

  • Affiliations:
  • Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology, and Immunology, Department of Mathematics and Statistics, University of Ottawa, 451 Smyth Road, Ottawa, Ontario, Cana ...

  • Venue:
  • International Journal of Approximate Reasoning
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

In the typical analysis of a data set, a single method is selected for statistical reporting even when equally applicable methods yield very different results. Examples of equally applicable methods can correspond to those of different ancillary statistics in frequentist inference and of different prior distributions in Bayesian inference. More broadly, choices are made between parametric and nonparametric methods and between frequentist and Bayesian methods. Rather than choosing a single method, it can be safer, in a game-theoretic sense, to combine those that are equally appropriate in light of the available information. Since methods of combining subjectively assessed probability distributions are not objective enough for that purpose, this paper introduces a method of distribution combination that does not require any assignment of distribution weights. It does so by formalizing a hedging strategy in terms of a game between three players: nature, a statistician combining distributions, and a statistician refusing to combine distributions. The optimal move of the first statistician reduces to the solution of a simpler problem of selecting an estimating distribution that minimizes the Kullback-Leibler loss maximized over the plausible distributions to be combined. The resulting combined distribution is a linear combination of the most extreme of the distributions to be combined that are scientifically plausible. The optimal weights are close enough to each other that no extreme distribution dominates the others. The new methodology is illustrated by combining conflicting empirical Bayes methods in the context of gene expression data analysis.