Mixture analysis of multivariate categorical data with covariates and missing entries

  • Author: Anton K. Formann
  • Affiliation: University of Vienna, Liebiggasse 5, 1010 Vienna, Austria
  • Venue: Computational Statistics & Data Analysis
  • Year: 2007

Abstract

Longitudinal or otherwise correlated categorical variables are typically related to some covariates, and the observed variables exhibit nonignorable correlations among themselves. Missing entries are often a further complication. To analyze such data, it is proposed to create an extra "missing" category and to employ latent class analysis, which, with respect to missing data, can be shown to belong to the family of not-missing-at-random models. By treating the complete and the incomplete cases jointly, it becomes possible to estimate the parameters of interest along with additional parameters characterizing the missingness mechanism. Data from the Muscatine Coronary Risk Factor Study, in which each child was classified as obese or not obese on three occasions, serve as an illustrative example. Previous analyses found a significant interaction of age and sex for the complete data (N=460) and, for the incomplete data (N=1014), a linear increase over time in the logit of the obesity rate, with no effect of the covariate sex. Reanalyses employing latent class models do not support these findings. The two-class model finally accepted for the complete data assumes a linear effect of age that is the same for boys and girls. The incomplete data were treated as three-categorical (not obese, obese, missing) and led to a more complex model that only partly supports the linear age hypothesis.
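The core idea of the abstract, recoding missing entries as an extra category and then fitting a latent class model to the resulting three-categorical responses, can be illustrated with a minimal sketch. This is not the author's model: it omits the covariates (age, sex) and any linear logistic restrictions, and fits only an unrestricted latent class model by a plain EM algorithm. The names `MISSING`, `recode_missing`, and `fit_latent_classes` are hypothetical and introduced here for illustration only.

```python
import numpy as np

MISSING = 2  # hypothetical code for the extra "missing" category


def recode_missing(data):
    """Map NaN entries to an extra category so that every cell is observed."""
    data = np.asarray(data, dtype=float)
    coded = np.where(np.isnan(data), MISSING, data)
    return coded.astype(int)


def fit_latent_classes(coded, n_classes=2, n_categories=3, n_iter=200, seed=0):
    """Plain EM for an unrestricted latent class model (assumption: no covariates)."""
    rng = np.random.default_rng(seed)
    n_items = coded.shape[1]
    # onehot[i, j, k] = 1 if subject i answered category k on occasion j
    onehot = np.eye(n_categories)[coded]
    weights = np.full(n_classes, 1.0 / n_classes)
    # probs[c, j, k] = P(category k on occasion j | latent class c)
    probs = rng.dirichlet(np.ones(n_categories), size=(n_classes, n_items))
    for _ in range(n_iter):
        # E-step: posterior class membership for each subject
        log_post = np.einsum("njk,cjk->nc", onehot, np.log(probs))
        log_post += np.log(weights + 1e-12)
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)
        # M-step: update class sizes and conditional response probabilities
        weights = post.mean(axis=0)
        counts = np.einsum("nc,njk->cjk", post, onehot) + 1e-9
        probs = counts / counts.sum(axis=2, keepdims=True)
    return weights, probs, post
```

In this sketch the probability of the "missing" category is estimated separately within each latent class, which is how the missingness mechanism is allowed to depend on the unobserved class. A call such as `fit_latent_classes(recode_missing(data))` returns the estimated class sizes, the class-conditional response probabilities, and the posterior class memberships.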