The effect of across-location heteroscedasticity on the classification of mixed categorical and continuous data

  • Authors:
  • Chi-Ying Leung

  • Affiliations:
  • Department of Statistics, The Chinese University of Hong Kong, Shatin NT, Hong Kong, Hong Kong

  • Venue:
  • Journal of Multivariate Analysis
  • Year:
  • 2003

Abstract

Classification of mixed categorical and continuous data is often performed using the location linear discriminant function, which assumes across-location homoscedasticity. In this paper, we investigate the hazard arising from routine application of this classifier under across-location heteroscedasticity. A limiting performance index and a first-order asymptotic performance index are proposed and studied in a general setting. The first index describes the limiting behavior of the classifier; the second corrects the bias due to the finite sample size. Both indexes are illustrated under the assumption of unequal spherical covariance matrices across all the locations. This is likely to be the case in most classification problems dealing with mixed categorical and continuous data. Results of a numerical study are reported.
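For readers unfamiliar with the classifier, the following is a minimal sketch of a two-class location linear discriminant rule. It assumes the standard location-model form: the categorical variables define a location (cell), each class has its own continuous mean vector and cell probability in that location, and one pooled covariance matrix is shared by all locations. The function names, the `params` layout, and the numerical values are hypothetical illustrations and are not taken from the paper.

```python
import numpy as np

def location_linear_score(x, mu1, mu2, pooled_cov, p1, p2):
    """Linear discriminant score within one location; >= 0 favours class 1.

    Assumed form: (mu1 - mu2)' S^{-1} (x - (mu1 + mu2)/2) - log(p2 / p1),
    where S is the covariance matrix pooled across all locations
    (the homoscedasticity assumption the abstract refers to).
    """
    w = np.linalg.solve(pooled_cov, mu1 - mu2)
    return float(w @ (x - 0.5 * (mu1 + mu2)) - np.log(p2 / p1))

def classify(x, location, params, pooled_cov):
    """Look up the location's parameters, then apply the linear rule."""
    mu1, mu2, p1, p2 = params[location]
    score = location_linear_score(x, mu1, mu2, pooled_cov, p1, p2)
    return 1 if score >= 0 else 2

# Hypothetical parameters for two locations:
# (class-1 mean, class-2 mean, class-1 cell probability, class-2 cell probability)
params = {
    0: (np.array([1.0, 0.0]), np.array([-1.0, 0.0]), 0.5, 0.5),
    1: (np.array([0.0, 2.0]), np.array([0.0, -2.0]), 0.3, 0.7),
}
pooled_cov = np.eye(2)  # one covariance matrix used for every location

label_a = classify(np.array([0.5, 0.0]), 0, params, pooled_cov)
label_b = classify(np.array([0.0, -1.0]), 1, params, pooled_cov)
print(label_a, label_b)
```

Under across-location heteroscedasticity, the true covariance matrix differs from location to location (for example, a different multiple of the identity in each location), yet the rule above still plugs in the single `pooled_cov`. That mismatch is the situation the abstract sets out to examine.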