As HPC machines move toward exascale, scaling the I/O performance of applications is a well-known problem. A closely related problem is how to efficiently analyze and extract useful information from I/O data, i.e., data post-processing. With the advent of nonvolatile memory (NVM) technologies such as SSDs, PCM, and memristors, research has focused on improving file system performance and on optimizations that overcome disk latencies. At the other end, there has been extensive focus on 'data staging' or 'in situ' I/O processing, in which I/O data are moved from computational cores to the memory buffers of dedicated data processing or staging nodes over high-performance I/O channels. The data are processed on these nodes before being written to persistent storage such as disks. However, issues with such approaches include (1) the limitation that they cannot easily analyze temporal data relationships or characteristics embedded in multiple simulation output steps, due to the limited aggregate memory capacity of staging nodes, and (2) the need to 'right size' such staging memory, sometimes even for single output/checkpoint steps when data volumes are large. Failing to properly allocate staging memory buffers, as in (2), can cause applications to block and severely degrade the performance improvements sought by the extensive parallelization efforts of application developers. The limitation posed by (1) can reduce the utility of the staging approach as seen by end users. This paper explores an alternative solution to the 'right memory sizing' issue for staging I/O. In this solution, memory scaling avoids the cost and power constraints that DRAM imposes on machine designers by instead using active NVRAM (nonvolatile memory) to enhance the memory capacities of compute and staging nodes. Active NVRAMs are node-local NVRAMs embedded with a low-power system-on-chip compute element.
We propose a mechanism in which each physical node has an additional active NVRAM component to stage I/O and apply simple data analytics operations to the I/O data. While such node-local data storage provides an obvious I/O acceleration, our experimental results show the effectiveness of our approach in addressing the 'right memory sizing' issue through efficient I/O data processing. We also discuss the overheads of using an active NVRAM-based approach for I/O staging.