MegaProto: 1 TFlops/10kW Rack Is Feasible Even with Only Commodity Technology

  • Authors:
  • Hiroshi Nakashima;Hiroshi Nakamura;Mitsuhisa Sato;Taisuke Boku;Satoshi Matsuoka;Daisuke Takahashi;Yoshihiko Hotta

  • Affiliations:
  • Toyohashi University of Technology;University of Tokyo;University of Tsukuba;University of Tsukuba;Tokyo Institute of Technology;University of Tsukuba;University of Tsukuba

  • Venue:
  • SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
  • Year:
  • 2005

Abstract

In our research project "Mega-Scale Computing Based on Low-Power Technology and Workload Modeling", we claim that a million-scale parallel system could be built with densely mounted low-power commodity processors. "MegaProto" is a proof-of-concept low-power, high-performance cluster built only with commodity components to demonstrate this claim. A one-rack system is composed of 32 motherboard "cluster units" of 1U height and commodity switches that interconnect the units with one another and with other racks. Each cluster unit houses 16 low-power, dollar-bill-sized commodity PC-architecture daughterboards, together with a high-bandwidth embedded switched network, based on Gigabit Ethernet, that provides 2 Gbps per processor. The peak performance of a one-rack system is 0.48 TFlops for the first version and will improve to 1.02 TFlops in the second version through a processor/daughterboard upgrade. The system consumes about 10 kW or less per rack, resulting in a power efficiency of 100 MFlops/W with a power-aware intra-rack network of 32 Gbps bisection bandwidth, while an additional 2.4 kW will boost the bisection bandwidth to a sufficiently large 256 Gbps. Performance studies show that even the first version significantly outperforms a conventional high-end 1U server with two power-hungry processors in a majority of the NAS Parallel Benchmarks (NPB) programs. We also investigate how current automated dynamic voltage scaling (DVS) control can save power for parallel HPC programs, along with its limitations.
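
The headline figures in the abstract can be cross-checked with simple arithmetic. The sketch below is only an illustration: the constants are the values quoted in the abstract, while the function names and the derived quantities (processors per rack, peak GFlops per processor, efficiency with the larger network) are our own and are not stated by the authors. In particular, it assumes that the extra 2.4 kW for the 256 Gbps network adds to the 10 kW rack budget and that the 100 MFlops/W figure refers to the second version's peak.

```python
# Figures quoted in the abstract (per rack).
UNITS_PER_RACK = 32           # 1U "cluster units"
PROCS_PER_UNIT = 16           # daughterboards, one processor each
PEAK_TFLOPS = {"v1": 0.48, "v2": 1.02}
RACK_POWER_KW = 10.0          # "about 10 kW or less"
EXTRA_NETWORK_POWER_KW = 2.4  # for the 256 Gbps bisection option


def processors_per_rack() -> int:
    """Derived: total processors in one rack."""
    return UNITS_PER_RACK * PROCS_PER_UNIT


def peak_gflops_per_processor(version: str) -> float:
    """Derived: rack peak divided evenly over all processors."""
    return PEAK_TFLOPS[version] * 1000.0 / processors_per_rack()


def mflops_per_watt(version: str, power_kw: float) -> float:
    """Peak power efficiency; 1 TFlops = 1e6 MFlops, 1 kW = 1e3 W."""
    return PEAK_TFLOPS[version] * 1e6 / (power_kw * 1e3)


if __name__ == "__main__":
    print("processors per rack:", processors_per_rack())
    for v in ("v1", "v2"):
        print(v, "GFlops/processor:", round(peak_gflops_per_processor(v), 3))
        print(v, "MFlops/W at 10 kW:", round(mflops_per_watt(v, RACK_POWER_KW), 1))
    # Assumption: the faster network's 2.4 kW is added on top of 10 kW.
    total_kw = RACK_POWER_KW + EXTRA_NETWORK_POWER_KW
    print("v2 MFlops/W with 256 Gbps network:",
          round(mflops_per_watt("v2", total_kw), 1))
```

Under these assumptions, a rack holds 512 processors, and the second version's 1.02 TFlops at 10 kW works out to about 102 MFlops/W, consistent with the rounded 100 MFlops/W claim in the abstract and the title.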