A Taxonomy of Visual Cluster Separation Factors

Authors:
M. Sedlmair; A. Tatu; T. Munzner; M. Tory
Affiliations:
University of British Columbia, Canada; University of Konstanz, Germany; University of British Columbia, Canada; University of Victoria, Canada
Venue:
Computer Graphics Forum
Year:
2012

Visualization

Abstract

We provide two contributions: a taxonomy of visual cluster separation factors in scatterplots, and an in-depth qualitative evaluation of two recently proposed and validated separation measures. We initially intended to use these measures to provide guidance for the use of dimension reduction (DR) techniques and visual encoding (VE) choices, but found that they failed to produce reliable results. To understand why, we conducted a systematic qualitative data study covering a broad collection of 75 real and synthetic high-dimensional datasets, four DR techniques, and three scatterplot-based visual encodings. Two authors visually inspected over 800 plots to determine whether the measures created plausible results. We found that they failed in over half the cases overall, and in over two-thirds of the cases involving real datasets. Using open and axial coding of failure reasons and separability characteristics, we generated a taxonomy of visual cluster separability factors. We iteratively refined its explanatory clarity and power by mapping the studied datasets and the success and failure ranges of the measures onto the factor axes. Our taxonomy has four categories, ordered by their ability to influence successors: Scale, Point Distance, Shape, and Position. Each category is split into Within-Cluster factors such as density, curvature, isotropy, and clumpiness, and Between-Cluster factors that arise from the variance of these properties, culminating in the overarching factor of class separation. The resulting taxonomy can be used to guide the design and the evaluation of cluster separation measures. © 2012 Wiley Periodicals, Inc.