DASH-IO: an empirical study of flash-based IO for HPC

Authors:
Jiahua He;Jeffrey Bennett;Allan Snavely
Affiliations:
University of California, San Diego;University of California, San Diego;University of California, San Diego
Venue:
Proceedings of the 2010 TeraGrid Conference
Year:
2010

Abstract

HPC applications are becoming increasingly data-intensive as simulation sizes grow and data acquisition burgeons. Unfortunately, the storage hierarchy of existing HPC architectures has a latency gap of five orders of magnitude between main memory and spinning disks, and it cannot respond well to this new data challenge. Flash-based SSDs (Solid State Disks) promise to fill the gap with their two-orders-of-magnitude lower latency. However, because existing hardware and software were designed without flash in mind, the question is how to integrate the new technology into existing architectures. DASH is a new TeraGrid resource that aggressively leverages flash technology (and also distributed shared memory technology) to fill the latency gap. To explore the potential and the issues of integrating flash into today's HPC systems, we swept a large parameter space with fast and reliable measurements to investigate a range of design options. Here we present the lessons we learned, along with suggestions for future architecture design. Our results show that performance can be improved by 9x with appropriate existing technologies, and probably further by future ones.