High performance multivariate visual data exploration for extremely large data

Authors:
Oliver Rübel; Prabhat; Kesheng Wu; Hank Childs; Jeremy Meredith; Cameron G. R. Geddes; Estelle Cormier-Michel; Sean Ahern; Gunther H. Weber; Peter Messmer; Hans Hagen; Bernd Hamann; E. Wes Bethel
Affiliations:
Lawrence Berkeley National Laboratory, Berkeley, CA and University of California, Davis, CA and Technische Universität Kaiserslautern, Kaiserslautern, Germany; Lawrence Berkeley National Laboratory, Berkeley, CA; Lawrence Berkeley National Laboratory, Berkeley, CA; Lawrence Livermore National Laboratory, Livermore, CA; Oak Ridge National Laboratory, Oak Ridge, TN; LOASIS program of Lawrence Berkeley National Laboratory, Berkeley, CA; LOASIS program of Lawrence Berkeley National Laboratory, Berkeley, CA; Oak Ridge National Laboratory, Oak Ridge, TN; Lawrence Berkeley National Laboratory, Berkeley, CA; Tech-X Corporation, Boulder, CO; Technische Universität Kaiserslautern, Kaiserslautern, Germany; Lawrence Berkeley National Laboratory, Berkeley, CA and University of California, Davis, CA and Technische Universität Kaiserslautern, Kaiserslautern, Germany; Lawrence Berkeley National Laboratory, Berkeley, CA and University of California, Davis, CA
Venue:
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
Year:
2008

Abstract

One of the central challenges in modern science is the need to quickly derive knowledge and understanding from large, complex collections of data. We present a new approach that addresses this challenge by combining and extending techniques from high performance visual data analysis and scientific data management. We demonstrate the approach in the context of gaining insight from complex, time-varying datasets produced by a laser wakefield accelerator simulation. Our approach leverages histogram-based parallel coordinates both for visual information display and as a vehicle for guiding a data mining operation. Data extraction and subsetting are implemented with state-of-the-art index/query technology. Although applied here to accelerator science, the approach is generally applicable to a broad set of science applications and is implemented in a production-quality visual data analysis infrastructure. We conduct a detailed performance analysis and demonstrate good scalability on a distributed memory Cray XT4 system.
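
The idea behind histogram-based parallel coordinates can be illustrated with a minimal sketch: instead of drawing one polyline per record, a 2D histogram is computed for each pair of adjacent axes, so the display cost depends on the number of bins rather than the number of records. The sketch below is an assumption-laden illustration, not the paper's implementation: the function names (`bin_index`, `adjacent_axis_histograms`, `select`), the uniform binning, and the sample records are all hypothetical, and the range query is a plain linear scan standing in for the index/query technology the abstract refers to.

```python
def bin_index(value, lo, hi, bins):
    """Map a value in [lo, hi] to a uniform bin index in [0, bins - 1]."""
    if hi == lo:
        return 0
    i = int((value - lo) / (hi - lo) * bins)
    return min(max(i, 0), bins - 1)


def adjacent_axis_histograms(records, bins):
    """Return one bins x bins count grid per pair of adjacent axes.

    Each grid cell (i, j) counts the records that fall into bin i on the
    left axis and bin j on the right axis of that pair.
    """
    n_axes = len(records[0])
    lows = [min(r[a] for r in records) for a in range(n_axes)]
    highs = [max(r[a] for r in records) for a in range(n_axes)]
    grids = []
    for a in range(n_axes - 1):
        grid = [[0] * bins for _ in range(bins)]
        for r in records:
            i = bin_index(r[a], lows[a], highs[a], bins)
            j = bin_index(r[a + 1], lows[a + 1], highs[a + 1], bins)
            grid[i][j] += 1
        grids.append(grid)
    return grids


def select(records, ranges):
    """Conjunctive range query by linear scan (hypothetical stand-in for an index).

    `ranges` maps an axis number to an inclusive (lo, hi) interval.
    """
    return [
        r for r in records
        if all(lo <= r[a] <= hi for a, (lo, hi) in ranges.items())
    ]


# Hypothetical records with three variables per record.
records = [
    (0.0, 1.0, 5.0),
    (1.0, 9.0, 5.5),
    (2.0, 8.5, 6.0),
    (3.0, 2.0, 7.0),
]

grids = adjacent_axis_histograms(records, bins=2)
subset = select(records, {1: (8.0, 10.0)})
focus_grids = adjacent_axis_histograms(subset, bins=2)
```

In this sketch, `grids` would serve as the context view of the whole dataset, while `focus_grids` is recomputed only for the records that satisfy the range condition, mirroring the use of the display to guide subsetting.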