Frequentist Model Averaging with missing observations

Authors:
Michael Schomaker;Alan T. K. Wan;Christian Heumann
Affiliations:
Ludwig Maximilian University of Munich, Department of Statistics, Akademiestr. 1, 80799 München, Germany;City University of Hong Kong, Department of Management Sciences, 83 Tat Chee Avenue, Kowloon, Hong Kong;Ludwig Maximilian University of Munich, Department of Statistics, Akademiestr. 1, 80799 München, Germany
Venue:
Computational Statistics & Data Analysis
Year:
2010

Abstract

Model averaging, or model combining, is often considered an alternative to model selection. This paper examines Frequentist Model Averaging (FMA) in detail and presents strategies for applying FMA methods in the presence of missing data, based on two distinct approaches. The first approach combines estimates from a set of appropriate models, weighted by scores of a missing-data-adjusted criterion developed in the recent model selection literature. The second approach averages over the estimates of a set of models with weights based on conventional model selection criteria, but with the missing data replaced by imputed values before the models are estimated. For this purpose, three easy-to-use imputation methods that are implemented in currently available statistical software are considered, and a simple recursive algorithm is further adapted to implement a generalized regression imputation in which the missing values are predicted successively. The latter algorithm is found to be quite useful when two or more values are missing simultaneously in a given row of observations. Focusing on a binary logistic regression model, the properties of the FMA estimators resulting from these strategies are explored by means of a Monte Carlo study. The results show that, in many situations, averaging after imputation is preferable to averaging with weights that adjust for the missing data, and that model average estimators often provide better estimates than those resulting from any single model. As an illustration, the proposed methods are applied to a dataset from a study of Duchenne muscular dystrophy detection.
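
To make the weighting step of the second approach concrete, the following is a minimal sketch of how estimates from several candidate models might be combined once each model has been fitted to the imputed data. The abstract does not state the weight formula, so the exponential AIC weights used here are an assumption (a common choice in the frequentist model averaging literature). The function names `aic_weights` and `model_average`, and all numeric values, are hypothetical and serve only as an illustration.

```python
import math


def aic_weights(aic_values):
    """Turn a list of AIC scores into weights that sum to one.

    Assumption: exponential (Akaike-type) weights, w_k proportional to
    exp(-AIC_k / 2). The abstract only says "conventional model
    selection criteria", so this specific form is illustrative.
    """
    best = min(aic_values)
    # Subtracting the smallest AIC avoids underflow and leaves the
    # normalized weights unchanged.
    raw = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(raw)
    return [r / total for r in raw]


def model_average(estimates, weights):
    """Weighted average of one coefficient across candidate models.

    A model that excludes the covariate contributes an estimate of 0.
    """
    return sum(w * e for w, e in zip(weights, estimates))


# Hypothetical AIC scores and estimates of a single logistic regression
# coefficient from three candidate models fitted after imputation.
aics = [100.0, 102.0, 110.0]
betas = [0.80, 0.50, 0.0]  # the third model omits the covariate

weights = aic_weights(aics)
beta_fma = model_average(betas, weights)
print(weights, beta_fma)
```

The model with the smallest AIC receives the largest weight, and the averaged coefficient always lies between the smallest and largest single-model estimates. Under the first approach described above, the AIC scores would be replaced by scores of a missing-data-adjusted criterion, with the averaging step unchanged.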