Heidi matrix: nearest neighbor driven high dimensional data visualization

  • Authors: Soujanya Vadapalli; Kamalakar Karlapalem
  • Affiliations: Centre for Data Engineering, IIIT-Hyderabad, India (both authors)
  • Venue: Proceedings of the ACM SIGKDD Workshop on Visual Analytics and Knowledge Discovery: Integrating Automated Analysis with Interactive Exploration
  • Year: 2009

Visualization

Abstract

Identifying patterns in large, high-dimensional data sets is a challenge. As the number of dimensions increases, patterns in the data tend to be more prominent in subspaces than in the original full-dimensional space. Understanding such data therefore requires a system that can present these subspace-oriented patterns. Heidi is a high-dimensional data visualization system that captures and visualizes the closeness of points across various subspaces of the dimensions, thereby helping users understand the data. The core concept behind Heidi is the prominence of patterns in the nearest-neighbor relations between pairs of points across the subspaces. Given a d-dimensional data set as input, the Heidi system generates a 2-D matrix represented as a color image. This representation gives insight into (i) how the clusters are placed with respect to each other, (ii) how points within a cluster are placed in all the subspaces, and (iii) the characteristics of overlapping clusters in various subspaces. A sample of results displayed and discussed in this paper illustrates how the Heidi visualization can be interpreted.
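
To make the idea of a matrix built from nearest-neighbor relations across subspaces more concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not specify the neighborhood size, the distance metric, which subspaces are considered, or how rows and colors are assigned, so everything here is an assumption: the hypothetical function `knn_subspace_matrix` uses k nearest neighbors under squared Euclidean distance, enumerates every non-empty subset of dimensions, and stores one bit per subspace in each matrix cell.

```python
from itertools import combinations


def knn_subspace_matrix(points, k):
    """Return (subspaces, matrix).

    matrix[i][j] is an integer bitmask: bit s is set when point j is among
    the k nearest neighbors of point i in subspaces[s]. The encoding and the
    choice of metric are illustrative assumptions, not taken from the paper.
    """
    n = len(points)
    d = len(points[0])
    # Enumerate all 2^d - 1 non-empty subsets of dimensions.
    # This grows exponentially, so the sketch only suits small d.
    subspaces = [
        dims for r in range(1, d + 1) for dims in combinations(range(d), r)
    ]
    matrix = [[0] * n for _ in range(n)]
    for s, dims in enumerate(subspaces):
        for i in range(n):
            # Squared Euclidean distance from point i, restricted to dims.
            def dist(j, i=i, dims=dims):
                return sum((points[i][a] - points[j][a]) ** 2 for a in dims)

            # Sort the other points by distance; break ties by index.
            others = sorted(
                (j for j in range(n) if j != i), key=lambda j: (dist(j), j)
            )
            for j in others[:k]:
                matrix[i][j] |= 1 << s
    return subspaces, matrix


if __name__ == "__main__":
    pts = [(0, 0), (1, 10), (10, 2), (12, 13)]
    subspaces, matrix = knn_subspace_matrix(pts, k=1)
    print(subspaces)
    for row in matrix:
        print(row)
```

Under these assumptions, each distinct bitmask could be mapped to its own color to render the matrix as an image, and ordering the rows and columns by cluster membership would make within-cluster and between-cluster blocks visible.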