Cohort-based kernel visualisation with scatter matrices

  • Authors:
  • Enrique Romero;Tingting Mu;Paulo J. G. Lisboa

  • Affiliations:
  • Departament de Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya, Spain;School of Computing, Informatics and Media, University of Bradford, UK;School of Computing and Mathematical Sciences, Liverpool John Moores University, UK

  • Venue:
  • Pattern Recognition
  • Year:
  • 2012

Visualization

Abstract

Visualisation that discriminates well between data cohorts is important for exploratory data analysis and for decision support interfaces. This paper proposes a kernel extension of the cluster-based linear visualisation method described in Lisboa et al. [15]. Representing the data in dual form permits the application of the kernel trick, so that the data are projected onto the orthonormalised cohort means in the feature space. The only parameters of the method are those of the kernel function. The method is shown to obtain well-discriminating visualisations of non-linearly separable data at low computational cost. The linearity of the visualisation was tested using nearest neighbour and linear discriminant classifiers, achieving significant improvements in classification accuracy with respect to the original features, especially for high-dimensional data, where 93% accuracy was obtained for the Splice-junction Gene Sequences data set from the UCI repository.
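
The central idea of the abstract, projecting data onto orthonormalised cohort means in the feature space using only kernel evaluations, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (rbf_kernel, fit_cohort_projection, project), the RBF kernel, the gamma value, the synthetic XOR-like data and the use of a Cholesky factor for the orthonormalisation are all assumptions made for illustration, and no centring or scatter-matrix weighting is applied, so the construction in the paper may differ.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Pairwise squared Euclidean distances between rows of X and rows of Y.
    sq = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_cohort_projection(K, labels):
    """Return dual coefficients W (n_samples x n_cohorts).

    Column j of W expresses the j-th orthonormalised cohort mean as a
    linear combination of the training points in feature space.
    """
    labels = np.asarray(labels)
    cohorts = np.unique(labels)
    # A[i, c] = 1/n_c if sample i is in cohort c, else 0, so that the
    # feature-space cohort means are Phi^T A (dual form).
    A = np.stack([(labels == c) / np.sum(labels == c) for c in cohorts], axis=1)
    # Gram matrix of the cohort means, computed from the kernel alone.
    G = A.T @ K @ A
    # Orthonormalise: with G = L L^T, B = L^{-T} gives B^T G B = I.
    L = np.linalg.cholesky(G)
    B = np.linalg.inv(L).T
    return A @ B

def project(K_new, W):
    # K_new[i, j] = k(new point i, training point j).
    return K_new @ W

# Hypothetical XOR-like data: two cohorts that are not linearly separable.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
labels = (X[:, 0] * X[:, 1] > 0).astype(int)

K = rbf_kernel(X, X, gamma=1.0)
W = fit_cohort_projection(K, labels)
Z = project(K, W)  # one visualisation coordinate per cohort
```

In this sketch the only tunable quantity is the kernel parameter gamma, which mirrors the abstract's statement that the method's only parameters are those of the kernel function. The cost beyond building the kernel matrix is a Cholesky factorisation of a matrix whose size equals the number of cohorts.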