Normal distribution based pseudo ML for missing data: With applications to mean and covariance structure analysis

Authors: Ke-Hai Yuan
Affiliations: Department of Psychology, University of Notre Dame, Notre Dame, IN 46556, United States
Venue: Journal of Multivariate Analysis
Year: 2009

Abstract

When missing data are either missing completely at random (MCAR) or missing at random (MAR), the maximum likelihood (ML) estimation procedure preserves many of its properties. However, in any statistical modeling, the distribution specification for the likelihood function is at best only an approximation to the real world. In particular, since the normal-distribution-based ML is typically applied to data with heterogeneous marginal skewness and kurtosis, it is necessary to know whether such a practice still generates consistent parameter estimates. When the manifest variables are linear combinations of independent random components and missing data are MAR, this paper shows that the normal-distribution-based MLE is consistent regardless of the distribution of the sample. Examples also show that the consistency of the MLE is not guaranteed for all nonnormally distributed samples. When the population follows a confirmatory factor model and data are missing due to the magnitude of the factors, the MLE may not be consistent even when data are normally distributed. When data are missing due to the magnitude of measurement errors/uniquenesses, MLEs for many of the covariance parameters related to the missing variables are still consistent. This paper also identifies and discusses the factors that affect the asymptotic biases of the MLE when data are not missing at random. The paper further shows that, under certain data models and an MAR mechanism, the MLE is asymptotically normally distributed and its asymptotic covariance matrix is consistently estimated by the commonly used sandwich-type covariance matrix. The results indicate that certain formulas and/or conclusions in the existing literature may not be entirely correct.
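
The consistency claim for nonnormal data under MAR can be illustrated with a small simulation. The sketch below is not taken from the paper: the two-variable setup, the loadings 0.8 and 0.6, the cutoff 0.5, and all function names are assumptions chosen for illustration. Two manifest variables are built as linear combinations of independent, skewed (centered exponential) components, and y is deleted whenever the always-observed x exceeds the cutoff, which is an MAR mechanism. For this monotone pattern the normal-distribution-based MLE of the mean of y has the well-known closed form: the complete-case mean of y plus the complete-case regression slope times the difference between the full-sample and complete-case means of x.

```python
import random


def simulate(n=100_000, seed=0):
    """Draw (x, y) as linear combinations of independent skewed components.

    Assumed setup (illustrative only): x = z1, y = 0.8*z1 + 0.6*z2, where
    z1 and z2 are independent centered exponentials, so E[y] = 0.
    y is treated as missing whenever x > 0.5 (MAR: depends only on observed x).
    """
    rng = random.Random(seed)
    xs, ys, observed = [], [], []
    for _ in range(n):
        z1 = rng.expovariate(1.0) - 1.0
        z2 = rng.expovariate(1.0) - 1.0
        x = z1
        y = 0.8 * z1 + 0.6 * z2
        xs.append(x)
        ys.append(y)
        observed.append(x <= 0.5)
    return xs, ys, observed


def mean(values):
    return sum(values) / len(values)


def normal_mle_mean_y(xs, ys, observed):
    """Normal-theory MLE of E[y] for a monotone pattern (x complete, y partly missing)."""
    x_obs = [x for x, o in zip(xs, observed) if o]
    y_obs = [y for y, o in zip(ys, observed) if o]
    xbar_all, xbar_obs, ybar_obs = mean(xs), mean(x_obs), mean(y_obs)
    # Complete-case regression slope of y on x.
    sxy = sum((x - xbar_obs) * (y - ybar_obs) for x, y in zip(x_obs, y_obs))
    sxx = sum((x - xbar_obs) ** 2 for x in x_obs)
    beta = sxy / sxx
    # Adjust the complete-case mean of y using the fully observed x.
    return ybar_obs + beta * (xbar_all - xbar_obs)


def complete_case_mean_y(ys, observed):
    """Naive estimate that simply discards cases with missing y."""
    return mean([y for y, o in zip(ys, observed) if o])


xs, ys, observed = simulate()
mle = normal_mle_mean_y(xs, ys, observed)
cc = complete_case_mean_y(ys, observed)
print(f"normal-theory MLE of E[y]: {mle:.4f}")
print(f"complete-case mean of y:   {cc:.4f}")
```

In this setup the true mean of y is 0. The normal-theory estimate should land close to 0 despite the skewed data, while the complete-case mean is pulled clearly below 0 because high-x cases are dropped. The example covers only one favorable case, in which the conditional mean of y given x is linear; it does not reproduce the paper's counterexamples or its not-missing-at-random results.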