Similarity-Dissimilarity Plot for Visualization of High Dimensional Data in Biomedical Pattern Classification

  • Authors:
  • Muhammad Arif

  • Affiliations:
  • Department of Computer Science and Engineering, Air University, Islamabad, Pakistan

  • Venue:
  • Journal of Medical Systems
  • Year:
  • 2012

Visualization

Abstract

In pattern classification problems, feature extraction is an important step, and the quality of the features in discriminating between classes plays an important role. In real-life problems, pattern classification may require a high-dimensional feature space, and it is impossible to visualize the feature space directly when its dimension is greater than four. In this paper, we propose a Similarity-Dissimilarity plot, which projects a high-dimensional space onto a two-dimensional space while retaining the characteristics required to assess the discrimination quality of the features. The Similarity-Dissimilarity plot reveals the amount of overlap between the features of different classes. Separable data points of different classes, which can be classified correctly using an appropriate classifier, are also visible on the plot; hence, the approximate classification accuracy can be predicted. Moreover, it is possible to identify the class with which the classifier will confuse the misclassified data points. Outlier data points can also be located on the plot. Various examples of synthetic data are used to highlight important characteristics of the proposed plot, and some real-life examples from biomedical data are also used for the analysis. The proposed plot is independent of the number of dimensions of the feature space.
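
The abstract does not spell out how the two plot coordinates are computed. As a rough illustration only, the sketch below assumes one plausible construction: for each data point, the distance to its nearest neighbour of the same class (taken as the similarity coordinate) and the distance to its nearest neighbour of any other class (taken as the dissimilarity coordinate). The function name `similarity_dissimilarity`, the use of Euclidean distance, and the nearest-neighbour definition are assumptions made for this example and may differ from the paper's actual method.

```python
import numpy as np

def similarity_dissimilarity(X, y):
    """Hypothetical sketch: per-point nearest same-class and other-class distances.

    Assumed construction, not taken from the paper. Returns (d_same, d_other,
    nearest_other_class). A point whose class has no other member gets
    d_same = inf.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    # Pairwise Euclidean distances; the output is 2-D whatever the input dimension.
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))

    same = y[:, None] == y[None, :]
    np.fill_diagonal(same, False)      # a point is not its own neighbour
    other = y[:, None] != y[None, :]

    d_same = np.where(same, D, np.inf).min(axis=1)
    masked_other = np.where(other, D, np.inf)
    d_other = masked_other.min(axis=1)

    # Class of the closest differently labelled point: the class this point
    # would most plausibly be confused with.
    nearest_other_class = y[masked_other.argmin(axis=1)]
    return d_same, d_other, nearest_other_class


# Toy data: two clusters, with the last point labelled 1 but lying in cluster 0.
X = [[0, 0], [0, 1], [10, 0], [10, 1], [0.2, 0.5]]
y = [0, 0, 1, 1, 1]
d_same, d_other, confused_with = similarity_dissimilarity(X, y)
```

Under this assumed construction, plotting `d_same` against `d_other` gives one point per sample regardless of the feature dimension. Samples with `d_other` smaller than `d_same` sit closer to another class than to their own, which is the kind of overlap the abstract describes, and `confused_with` names the class they are nearest to.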