Guided discovery of interesting relationships between time series clusters and metadata properties

Authors:
Jürgen Bernard;Tobias Ruppert;Maximilian Scherer;Tobias Schreck;Jörn Kohlhammer
Affiliations:
Fraunhofer Institute for Computer Graphics Research, Fraunhoferstr., Darmstadt, Germany;Fraunhofer Institute for Computer Graphics Research, Fraunhoferstr., Darmstadt, Germany;TU Darmstadt, Fraunhoferstr., Darmstadt, Germany;University of Konstanz, Universitätsstr., Konstanz, Germany;Fraunhofer Institute for Computer Graphics Research, Fraunhoferstr., Darmstadt, Germany
Venue:
Proceedings of the 12th International Conference on Knowledge Management and Knowledge Technologies
Year:
2012


Visualization

Abstract

Visual cluster analysis provides valuable tools that help analysts understand large data sets in terms of representative clusters and the relationships among them. Often, the clusters found must be understood in the context of associated categorical, numerical, or textual metadata given for the data elements. Although such metadata are often not part of the clustering process, they play an important role and need to be considered during interactive cluster exploration. Traditionally, linked views allow the analyst to relate (or, loosely speaking, correlate) clusters with metadata or other properties of the underlying cluster data. Manually inspecting the distribution of metadata for each cluster in a linked-view approach is tedious, especially for large data sets, where a large search problem arises. A fully interactive search for potentially useful or interesting cluster-to-metadata relationships may be a cumbersome and lengthy process. To remedy this problem, we propose a novel approach for guiding users in discovering interesting relationships between clusters and associated metadata. Its goal is to guide the analyst through the potentially huge search space. We focus on metadata of categorical type, which can be summarized for a cluster in the form of a histogram. Starting from a given visual cluster representation, we compute measures of interestingness defined on the distribution of metadata categories for the clusters. These measures are used to automatically score and rank the clusters by potential interestingness with regard to the distribution of categorical metadata. Identified interesting relationships are highlighted in the visual cluster representation for easy inspection by the user. We present a system implementing a comprehensive yet extensible set of interestingness scores for categorical metadata, which can also be extended to numerical metadata. Appropriate visual representations are provided for showing the visual correlations as well as the calculated ranking scores. Focusing on clusters of time series data, we test our approach on a large real-world data set of time-oriented scientific research data, demonstrating how specific interesting views are automatically identified, supporting the analyst in discovering interesting and visually understandable relationships.
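
The scoring-and-ranking idea can be illustrated with a minimal sketch. The abstract does not name the interestingness measures used, so the example below assumes one plausible choice: the Kullback-Leibler divergence between a cluster's category histogram and the histogram over the whole data set. The function names (`category_histogram`, `kl_divergence`, `rank_clusters`) and the toy data are hypothetical and are not taken from the described system.

```python
from collections import Counter
from math import log2


def category_histogram(labels, categories):
    """Relative frequency of each category among the given labels."""
    counts = Counter(labels)
    total = len(labels)
    return [counts[c] / total for c in categories]


def kl_divergence(p, q):
    """D(p || q) in bits; terms where p is zero contribute nothing."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def rank_clusters(cluster_ids, metadata):
    """Rank clusters by how far their category histogram departs from the
    global one (assumed measure, not necessarily the one used in the paper)."""
    categories = sorted(set(metadata))
    global_hist = category_histogram(metadata, categories)

    # Group the categorical metadata values by cluster.
    members = {}
    for cid, label in zip(cluster_ids, metadata):
        members.setdefault(cid, []).append(label)

    scores = {
        cid: kl_divergence(category_histogram(labels, categories), global_hist)
        for cid, labels in members.items()
    }
    # Highest score first: the most "interesting" cluster leads the ranking.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: one cluster id and one categorical value per time series.
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
metadata = ["summer", "summer", "summer", "summer",
            "winter", "winter", "summer", "winter",
            "summer", "winter", "summer", "winter"]

ranking = rank_clusters(cluster_ids, metadata)
for cid, score in ranking:
    print(f"cluster {cid}: score {score:.3f}")
```

In this toy setting the cluster containing only "summer" series ranks first, because its histogram differs most from the global distribution, while the evenly mixed cluster ranks last. Other measures defined on the same histograms could be substituted for `kl_divergence` without changing the ranking step.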