Cluster-based visualisation with scatter matrices

  • Authors:
  • P. J. G. Lisboa; I. O. Ellis; A. R. Green; F. Ambrogi; M. B. Dias

  • Affiliations:
  • School of Computing and Mathematical Sciences, John Moores University, Liverpool L3 3AF, UK; Department of Histopathology, School of Molecular Medical Sciences, Nottingham University Hospitals Trust, University of Nottingham, NG5 1PB, UK; Department of Histopathology, School of Molecular Medical Sciences, Nottingham University Hospitals Trust, University of Nottingham, NG5 1PB, UK; Department of Preventive and Predictive Medicine, Istituto Nazionale per lo Studio e la Cura dei Tumori, 20133 Milano, Italy; Mathematics and Psychological Sciences, Unilever Corporate Research, Sharnbrook, Bedfordshire MK44 1LQ, UK

  • Venue:
  • Pattern Recognition Letters
  • Year:
  • 2008

Visualization

Abstract

The trace of the scatter matrix, which measures the separation between population cohorts, is shown to be strictly preserved when the data are sphered and then projected onto the space of population means. This result suggests using the space of means as a basis for calculating well-separating lower-dimensional projections of the data, derived from the scatter matrix in the projective space. In particular, it defines an approximation to the canonical decomposition of the scatter matrix that applies to singular covariance matrices. The method is illustrated with reference to k-means clusters in data sets from bioinformatics and marketing.
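
The trace-preservation claim can be checked numerically with a minimal sketch. The abstract does not give the exact definitions, so several choices here are assumptions rather than the authors' method: the separation measure is taken to be trace(S_w^{-1} S_b), sphering uses the within-cohort covariance S_w (assumed non-singular, so the singular case mentioned in the abstract is not covered), and the cohorts are synthetic labelled groups rather than k-means clusters. The function names `scatter_matrices`, `separation` and `sphere_and_project` are hypothetical.

```python
import numpy as np

def scatter_matrices(X, labels):
    """Within- and between-cohort scatter matrices, both normalised by n."""
    n, d = X.shape
    grand_mean = X.mean(axis=0)
    S_w = np.zeros((d, d))
    S_b = np.zeros((d, d))
    for k in np.unique(labels):
        Xk = X[labels == k]
        mk = Xk.mean(axis=0)
        centred = Xk - mk
        S_w += centred.T @ centred / n
        diff = (mk - grand_mean)[:, None]
        S_b += (len(Xk) / n) * (diff @ diff.T)
    return S_w, S_b

def separation(X, labels):
    """Assumed separation measure: trace(S_w^{-1} S_b)."""
    S_w, S_b = scatter_matrices(X, labels)
    return np.trace(np.linalg.solve(S_w, S_b))

def sphere_and_project(X, labels):
    """Sphere with S_w^{-1/2}, then project onto the span of the cohort means."""
    S_w, _ = scatter_matrices(X, labels)
    # Symmetric inverse square root; assumes S_w is non-singular.
    evals, evecs = np.linalg.eigh(S_w)
    W = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - X.mean(axis=0)) @ W
    means = np.stack([Z[labels == k].mean(axis=0) for k in np.unique(labels)])
    # Orthonormal basis of the space of means from the right singular vectors.
    _, s, Vt = np.linalg.svd(means, full_matrices=False)
    rank = int(np.sum(s > 1e-10 * s.max()))
    Q = Vt[:rank].T
    return Z @ Q

# Synthetic data: three cohorts of 50 points in six dimensions (illustrative only).
rng = np.random.default_rng(0)
centres = rng.normal(scale=3.0, size=(3, 6))
labels = np.repeat(np.arange(3), 50)
X = centres[labels] + rng.normal(size=(150, 6))

Y = sphere_and_project(X, labels)
before = separation(X, labels)
after = separation(Y, labels)
print(Y.shape, before, after)
```

Under these assumptions the two printed separation values should agree to numerical precision, even though `Y` has far fewer columns than `X`: after sphering, the within-cohort covariance is the identity, and the between-cohort scatter lies entirely in the space spanned by the cohort means, so projecting onto that space discards none of its trace.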