Direct classification of high-dimensional data in low-dimensional projected feature spaces-Comparison of several classification methodologies

  • Authors:
  • R. L. Somorjai;B. Dolenko;M. Mandelzweig

  • Affiliations:
  • Institute for Biodiagnostics, National Research Council Canada, 435 Ellice Avenue, Winnipeg, Man., Canada R3B 1Y6;Institute for Biodiagnostics, National Research Council Canada, 435 Ellice Avenue, Winnipeg, Man., Canada R3B 1Y6;Institute for Biodiagnostics, National Research Council Canada, 435 Ellice Avenue, Winnipeg, Man., Canada R3B 1Y6

  • Venue:
  • Journal of Biomedical Informatics
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

Previously, we introduced a distance (similarity)-based mapping for the visualization of high-dimensional patterns and their relative relationships. The mapping preserves exactly the original distances from all points to any two reference patterns in a special two-dimensional coordinate system, the relative distance plane (RDP). We extend the RDP mapping's applicability from visualization to classification. Several of the classifiers use the RDP directly. These include the standard linear discriminant analysis (LDA), nearest neighbor classifiers, and a transvariation probabilities-based classification method that is natural in the RDP. Several reference directions can also be combined to create new coordinate systems in which arbitrary classifiers can be developed. We obtain increased confidence in the classification results by cycling through all possible reference pairs and computing a misclassification-based weighted accuracy. The classification results on several high-dimensional biomedical datasets are compared.