Data-intensive computing presents a significant challenge for traditional supercomputing architectures that maximize FLOPS, since CPU speed has outpaced the I/O capabilities of HPC systems and Beowulf clusters. We present the architecture of GrayWulf, a three-tier cluster built from commodity components and designed for a range of data-intensive computations on petascale data sets. The design goal is a system balanced in terms of I/O performance and memory size, according to Amdahl's laws. The hardware currently installed at JHU exceeds one petabyte of storage and has 0.5 bytes/sec of I/O and 1 byte of memory for each CPU cycle. GrayWulf provides almost an order of magnitude better balance than existing systems. The paper covers its architecture and reference applications; the software design is presented in a companion paper.
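The balance figures quoted above are ratios of I/O bandwidth and memory size to aggregate CPU cycle rate. The following Python sketch shows one way such ratios could be computed for a single node. The `Node` class, the function names, and the example hardware numbers are all hypothetical and chosen only so the arithmetic is easy to follow; they are not the GrayWulf configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical description of one cluster node (illustrative only)."""
    cores: int
    clock_hz: float          # cycles per second, per core
    io_bytes_per_sec: float  # sustained I/O bandwidth of the node
    memory_bytes: float      # installed RAM

    @property
    def cycles_per_sec(self) -> float:
        # Aggregate CPU cycle rate across all cores.
        return self.cores * self.clock_hz


def io_balance(node: Node) -> float:
    """Bytes/sec of I/O available per CPU cycle/sec."""
    return node.io_bytes_per_sec / node.cycles_per_sec


def memory_balance(node: Node) -> float:
    """Bytes of memory available per CPU cycle/sec."""
    return node.memory_bytes / node.cycles_per_sec


# Invented numbers: 8 cores at 2 GHz, 8 GB/s of I/O, 16 GB of RAM.
example = Node(cores=8, clock_hz=2.0e9,
               io_bytes_per_sec=8.0e9, memory_bytes=16.0e9)

print(f"I/O balance:    {io_balance(example):.2f} bytes/sec per cycle/sec")
print(f"memory balance: {memory_balance(example):.2f} bytes per cycle/sec")
```

With these invented inputs the node happens to land on the same ratios as those quoted in the abstract (0.5 for I/O and 1 for memory); a FLOPS-oriented node with the same CPUs but far less I/O bandwidth would score proportionally lower on the first ratio.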