Frequentist Model Averaging with missing observations

  • Authors:
  • Michael Schomaker; Alan T. K. Wan; Christian Heumann

  • Affiliations:
  • Ludwig Maximilian University of Munich, Department of Statistics, Akademiestr. 1, 80799 München, Germany; City University of Hong Kong, Department of Management Sciences, 83 Tat Chee Avenue, Kowloon, Hong Kong; Ludwig Maximilian University of Munich, Department of Statistics, Akademiestr. 1, 80799 München, Germany

  • Venue:
  • Computational Statistics & Data Analysis
  • Year:
  • 2010

Abstract

Model averaging or combining is often considered as an alternative to model selection. Frequentist Model Averaging (FMA) is considered extensively and strategies for the application of FMA methods in the presence of missing data based on two distinct approaches are presented. The first approach combines estimates from a set of appropriate models which are weighted by scores of a missing data adjusted criterion developed in the recent literature of model selection. The second approach averages over the estimates of a set of models with weights based on conventional model selection criteria but with the missing data replaced by imputed values prior to estimating the models. For this purpose three easy-to-use imputation methods that have been programmed in currently available statistical software are considered, and a simple recursive algorithm is further adapted to implement a generalized regression imputation in a way such that the missing values are predicted successively. The latter algorithm is found to be quite useful when one is confronted with two or more missing values simultaneously in a given row of observations. Focusing on a binary logistic regression model, the properties of the FMA estimators resulting from these strategies are explored by means of a Monte Carlo study. The results show that in many situations, averaging after imputation is preferred to averaging using weights that adjust for the missing data, and model average estimators often provide better estimates than those resulting from any single model. As an illustration, the proposed methods are applied to a dataset from a study of Duchenne muscular dystrophy detection.
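
The weighting step common to both approaches can be illustrated with a small sketch. The abstract does not give the weight formula, so the example below assumes one widely used choice, exponential AIC weights with w_k proportional to exp(-AIC_k / 2); the function names `aic_weights` and `model_average`, and all numbers, are hypothetical and only for illustration. In the averaging-after-imputation setting, the AIC values and coefficient estimates would come from candidate models fitted to the completed (imputed) data.

```python
import numpy as np

def aic_weights(aic):
    """Exponential AIC weights (an assumed choice, not taken from the paper):
    w_k is proportional to exp(-0.5 * AIC_k), normalised to sum to one."""
    aic = np.asarray(aic, dtype=float)
    delta = aic - aic.min()      # shift by the minimum for numerical stability
    raw = np.exp(-0.5 * delta)
    return raw / raw.sum()

def model_average(estimates, aic):
    """Weighted average of per-model coefficient vectors.

    estimates: array-like of shape (n_models, n_coefs). A coefficient that a
    candidate model excludes is entered as 0 for that model.
    Returns the averaged coefficient vector and the weights.
    """
    estimates = np.asarray(estimates, dtype=float)
    w = aic_weights(aic)
    return w @ estimates, w

# Hypothetical coefficient estimates and AIC values for three candidate
# logistic regression models (illustrative numbers only).
estimates = [[0.5, 1.2, 0.0],
             [0.4, 1.0, 0.3],
             [0.6, 0.0, 0.0]]
aic = [102.3, 101.1, 110.8]

beta_avg, weights = model_average(estimates, aic)
print("weights:", weights)
print("model-averaged coefficients:", beta_avg)
```

Entering zeros for excluded coefficients shrinks the averaged estimate toward zero for variables that only poorly supported models include, which is one reason an averaged estimator can behave differently from the estimator of any single selected model.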