Mapping nominal values to numbers for effective visualization

Authors:
Geraldine E. Rosario; Elke A. Rundensteiner; David C. Brown; Matthew O. Ward; Shiping Huang
Affiliations:
Computer Science Department, Worcester Polytechnic Institute, Worcester; Computer Science Department, Worcester Polytechnic Institute, Worcester; Computer Science Department, Worcester Polytechnic Institute, Worcester; Computer Science Department, Worcester Polytechnic Institute, Worcester; Computer Science Department, Worcester Polytechnic Institute, Worcester
Venue:
Information Visualization - Special issue of selected and extended InfoVis 03 papers
Year:
2004

Abstract

Data sets with a large number of nominal variables, including some with a large number of distinct values, are becoming increasingly common and need to be explored. Unfortunately, most existing visual exploration tools are designed to handle numeric variables only. When data sets with nominal values are imported into such visualization tools, most solutions to date are rather simplistic. Often, techniques that map nominal values to numbers do not assign order or spacing among the values in a manner that conveys semantic relationships. Moreover, displays designed for nominal variables usually cannot handle high-cardinality variables well. This paper addresses the problem of how to display nominal variables in general-purpose visual exploration tools designed for numeric variables. Specifically, we investigate (1) how to assign order and spacing among the nominal values, and (2) how to reduce the number of distinct values to display. We propose a new technique, called the Distance-Quantification-Classing (DQC) approach, to preprocess nominal variables before they are imported into a visual exploration tool. In the Distance Step, we identify a set of independent dimensions that can be used to calculate the distance between nominal values. In the Quantification Step, we use the independent dimensions and the distance information to assign order and spacing among the nominal values. In the Classing Step, we use results from the previous steps to determine which values within the domain of a variable are similar to each other and thus can be grouped together. Each step in the DQC approach can be accomplished by a variety of techniques. We extended the XmdvTool package to incorporate this approach. We evaluated our approach on several data sets using a variety of measures.
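
To make the three steps concrete, the following is a minimal sketch of one possible instantiation, not the implementation described in the paper or shipped in XmdvTool. It assumes that each value of the nominal variable is described by its counts against the values of a second variable (a contingency table), that reciprocal averaging (a power-iteration route to the first correspondence-analysis axis) stands in for the Distance and Quantification steps, and that a simple largest-gap split stands in for the Classing step. The function names `quantify` and `classify` and the toy table are hypothetical.

```python
import math


def _standardize(scores, weights):
    """Center and scale scores to weighted mean 0 and weighted variance 1."""
    total = sum(weights)
    mean = sum(w * s for w, s in zip(weights, scores)) / total
    centered = [s - mean for s in scores]
    var = sum(w * c * c for w, c in zip(weights, centered)) / total
    if var == 0:
        return centered  # degenerate case: no association to exploit
    sd = math.sqrt(var)
    return [c / sd for c in centered]


def quantify(counts, iterations=200):
    """Distance + Quantification (assumed technique: reciprocal averaging).

    counts[i][j] is how often nominal value i co-occurs with value j of a
    second variable. Each row score is the average of the column scores,
    weighted by that row's counts, so values with similar row profiles end
    up close together on the numeric axis.
    """
    row_tot = [sum(row) for row in counts]
    columns = list(zip(*counts))
    col_tot = [sum(col) for col in columns]
    y = [float(j) for j in range(len(col_tot))]  # arbitrary non-constant start
    x = [0.0] * len(counts)
    for _ in range(iterations):
        y = _standardize(y, col_tot)
        x = [sum(n * s for n, s in zip(row, y)) / t
             for row, t in zip(counts, row_tot)]
        y = [sum(n * s for n, s in zip(col, x)) / t
             for col, t in zip(columns, col_tot)]
    return x


def classify(scores, n_classes):
    """Classing (assumed technique): cut the sorted scores at the largest gaps."""
    order = sorted(range(len(scores)), key=scores.__getitem__)
    gaps = [scores[order[i + 1]] - scores[order[i]]
            for i in range(len(order) - 1)]
    n_cuts = max(n_classes - 1, 0)
    cut_after = set(sorted(range(len(gaps)), key=gaps.__getitem__,
                           reverse=True)[:n_cuts])
    labels = [0] * len(scores)
    current = 0
    for pos, idx in enumerate(order):
        labels[idx] = current
        if pos in cut_after:
            current += 1
    return labels


# Toy table: rows are four nominal values, columns are three values of a
# second variable. Rows 0 and 1 have the same profile, as do rows 2 and 3.
counts = [[20, 5, 1], [40, 10, 2], [2, 6, 30], [1, 3, 15]]
scores = quantify(counts)
labels = classify(scores, n_classes=2)
print(scores, labels)
```

In this toy table the first two values receive the same score and share one class, and the last two share another. The sign of the scores is arbitrary, so only the ordering and the relative spacing carry meaning. Any of the three stand-in techniques could be swapped for another, in line with the abstract's remark that each step can be accomplished in several ways.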