Configuring Large High-Performance Clusters at Lightspeed: A Case Study

  • Authors:
  • Philip M. Papadopoulos, Caroline A. Papadopoulos, Mason J. Katz, William J. Link, Greg Bruno

  • Affiliations:
  • Philip M. Papadopoulos, Mason J. Katz, William J. Link, Greg Bruno: The San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093-0505
  • Caroline A. Papadopoulos: Physical Oceanography Research Division, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, CA 92093-0505

  • Venue:
  • International Journal of High Performance Computing Applications
  • Year:
  • 2004

Abstract

Over a decade ago, the TOP500 list was started as a way to measure supercomputers by their sustained performance on a particular linear algebra benchmark. Once the preserve of exotic machines and extremely well-funded centers and laboratories, high-performance computing is now within reach of smaller groups, which can deploy and use commodity clusters in their own laboratories. This paper describes a weekend activity in which two existing 128-node commodity clusters were fused into a single 256-node cluster for the specific purpose of running the benchmark used to rank the machines on the TOP500 supercomputer list. The resulting metacluster sits on the November 2002 list at position 233. A key differentiator for this cluster is that its software was assembled from the NPACI Rocks open-source cluster toolkit as downloaded from the public website. The toolkit allows users who are not cluster experts to deploy and run supercomputer-class machines in a matter of hours instead of weeks or months. With the exception of recompiling the University of Tennessee's Automatically Tuned Linear Algebra Subroutines (ATLAS) library with a recommended version of the GNU C compiler, this metacluster ran a "stock" Rocks distribution. Successful first-time deployment of the fused cluster was completed in a scant 6 h. Partitioning of the metacluster and restoration of the two 128-node clusters to their original configurations was completed in just over 40 min. This paper describes early (pre-weekend) benchmark activities undertaken to empirically determine reasonably good parameters for the High Performance Linpack (HPL) code on both Ethernet and Myrinet interconnects. It fully describes the physical layout of the machine and the description-based installation methods used in Rocks to redeploy two independent clusters as a single cluster, and it gives the benchmark results gathered over the 40-h period allotted for the complete experiment. In addition, we describe some of the online monitoring and measurement techniques that were employed during the experiment. Finally, we point out the issues uncovered with a commodity cluster of this size. The techniques presented in this paper truly bring supercomputers into the hands of computational scientists at large.
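The "reasonably good parameters" mentioned above refer to HPL's tunables such as the problem size N and block size NB. As a minimal sketch of the usual starting point for that search (not the parameters or hardware figures reported in the paper), the snippet below derives N from the common rule of thumb that the N x N matrix of doubles should fill roughly 80% of aggregate memory; the node count, per-node memory, and NB value are hypothetical placeholders, and NB in practice is tuned empirically against the BLAS (here, ATLAS) DGEMM kernel.

```python
# Back-of-the-envelope HPL problem-size estimate (rule of thumb only; the
# paper's actual HPL parameters were determined empirically).
import math


def estimate_hpl_params(nodes, mem_per_node_gib, mem_fraction=0.80, nb=128):
    """Return a suggested (N, NB) pair for HPL on a homogeneous cluster."""
    total_bytes = nodes * mem_per_node_gib * 1024 ** 3
    # The coefficient matrix holds N*N doubles (8 bytes each); solve
    # 8 * N^2 <= mem_fraction * total_bytes for N.
    n_max = math.isqrt(int(mem_fraction * total_bytes / 8))
    # Round N down to a multiple of NB so the block-cyclic decomposition
    # divides the matrix evenly.
    n = (n_max // nb) * nb
    return n, nb


if __name__ == "__main__":
    # Hypothetical figures for illustration: 256 nodes with 1 GiB of RAM each.
    n, nb = estimate_hpl_params(nodes=256, mem_per_node_gib=1)
    print(f"Suggested HPL.dat starting point: N = {n}, NB = {nb}")
```

The resulting N is only a first guess; in practice it is adjusted downward to leave headroom for the operating system and MPI buffers, and NB, the process grid (P x Q), and the broadcast algorithm are then swept empirically on the target interconnect.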