Cluster rendering of skewed datasets via visualization

Authors:
Keke Chen; Ling Liu
Affiliations:
Georgia Institute of Technology, Atlanta, GA; Georgia Institute of Technology, Atlanta, GA
Venue:
Proceedings of the 2003 ACM symposium on Applied computing
Year:
2003



Abstract

Information visualization is commonly recognized as a useful method for understanding the complexity of large datasets. In this paper, we introduce a flexible clustering approach based on visualization techniques, aimed at datasets with skewed cluster distributions. The paper makes three contributions. First, we propose a framework, Vista, that incorporates information visualization methods into the clustering process in order to improve understanding of the intermediate clustering results and to let the user revise those results easily. Second, we develop a visualization model that maps a multidimensional dataset to 2D visualizations while preserving, or partially preserving, its clusters. Third, based on the visualization model, we propose a set of operating rules that guide the user in rendering clusters efficiently. Experiments show that the Vista system can yield lower error rates on real datasets than typical automated algorithms.