Heidi matrix: nearest neighbor driven high dimensional data visualization

Authors:
Soujanya Vadapalli;Kamalakar Karlapalem
Affiliations:
Centre for Data Engineering, IIIT-Hyderabad, India;Centre for Data Engineering, IIIT-Hyderabad, India
Venue:
Proceedings of the ACM SIGKDD Workshop on Visual Analytics and Knowledge Discovery: Integrating Automated Analysis with Interactive Exploration
Year:
2009

Citing 4

Fast algorithms for projected clustering
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data

Subspace Selection for Clustering High-Dimensional Data
ICDM '04 Proceedings of the Fourth IEEE International Conference on Data Mining

A Simple Yet Effective Data Clustering Algorithm
ICDM '06 Proceedings of the Sixth International Conference on Data Mining

VISA: visual subspace clustering analysis
ACM SIGKDD Explorations Newsletter - Special issue on visual analytics

Cited 1

Heidi visualization of R-tree structures over high dimensional data
SSDBM'11 Proceedings of the 23rd international conference on Scientific and statistical database management

Visualization

Abstract

Identifying patterns in large, high-dimensional data sets is a challenge. As the number of dimensions increases, patterns tend to be more prominent in subspaces than in the original full-dimensional space. Understanding such data therefore requires a system that presents these subspace-oriented patterns. Heidi is a high-dimensional data visualization system that captures and visualizes the closeness of points across various subspaces of the dimensions, thereby helping to understand the data. The core concept behind Heidi is the prominence of patterns in the nearest-neighbor relations between pairs of points across subspaces. Given a d-dimensional data set as input, the Heidi system generates a 2-D matrix represented as a color image. This representation gives insight into (i) how the clusters are placed with respect to each other, (ii) how points are placed within a cluster in all the subspaces, and (iii) the characteristics of overlapping clusters in various subspaces. A sample of results displayed and discussed in this paper illustrates how the Heidi visualization can be interpreted.
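
To make the idea of a nearest-neighbor-driven matrix concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that every non-empty subset of dimensions is treated as a subspace, that closeness means Euclidean k-nearest-neighbor membership, and that each matrix cell stores one bit per subspace; the function name `subspace_knn_matrix`, the parameter `k`, and the bit encoding are all hypothetical choices made for illustration. Rows and columns follow the input order of the points.

```python
from itertools import combinations

import numpy as np


def subspace_knn_matrix(X, k):
    """Return (codes, subspaces).

    codes[i, j] has bit s set when point j is among the k nearest
    neighbors of point i within subspaces[s] (an assumed encoding).
    """
    n, d = X.shape
    # Assumption: enumerate all 2^d - 1 non-empty subspaces, so this sketch
    # is only practical for small d (int64 holds at most 62 subspace bits).
    subspaces = [dims for r in range(1, d + 1)
                 for dims in combinations(range(d), r)]
    codes = np.zeros((n, n), dtype=np.int64)
    for s, dims in enumerate(subspaces):
        sub = X[:, list(dims)]
        # Pairwise Euclidean distances restricted to this subspace.
        diff = sub[:, None, :] - sub[None, :, :]
        dist = np.sqrt((diff ** 2).sum(axis=-1))
        np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
        nn = np.argsort(dist, axis=1)[:, :k]
        rows = np.repeat(np.arange(n), k)
        codes[rows, nn.ravel()] |= 1 << s
    return codes, subspaces


# Two well-separated pairs of points in 2-D.
X = np.array([[0.0, 0.0], [0.1, 0.1], [5.0, 5.0], [5.1, 5.2]])
codes, subspaces = subspace_knn_matrix(X, k=1)
```

Each distinct value in `codes` could then be mapped to a color to obtain an image. Under this encoding, cells with many bits set mark pairs of points that stay close in most subspaces, while cells with few bits set mark pairs that are close only in particular subspaces.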