An overview of the BlueGene/L Supercomputer
Proceedings of the 2002 ACM/IEEE conference on Supercomputing
High-density computing: a 240-processor Beowulf in one cubic meter
Proceedings of the 2002 ACM/IEEE conference on Supercomputing
MegaProto: A Low-Power and Compact Cluster for High-Performance Computing
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 11 - Volume 12
MegaProto: 1 TFlops/10kW Rack Is Feasible Even with Only Commodity Technology
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
Low-cost High-bandwidth Tree Network for PC Clusters based on Tagged-VLAN Technology
ISPAN '05 Proceedings of the 8th International Symposium on Parallel Architectures,Algorithms and Networks
Hi-index | 0.02 |
In our research project named "Mega-Scale Computing Based on Low-Power Technology and Workload Modeling", we have been developing a prototype cluster not based on ASIC or FPGA but instead only using commodity technology. Its packaging is extremely compact and dense, and its performance/power ratio is very high. Our previous prototype system named "MegaProto" demonstrated that one cluster unit, which consists of 16 commodity low-power processors, can be successfully implemented on just 1U height chassis and it is capable of up to 2.8 times higher performance/ power ratio than ordinary high-performance dual-Xeon 1U server units. We have improved MegaProto by replacing the CPU and enhancing the I/O performance. The new cluster unit named "MegaProto/E" with 16 Transmeta Efficeon processors achieves 32 GFlops of peak performance, which is 2.2- fold greater than that of the original one. The cluster unit is equipped with an independent dual network of Gigabit Ethernet, including dual 24-port switches. The maximum power consumption of the cluster unit is 320 W, which is comparable with that of today's high-end PC servers for high performance clusters. Performance evaluation using NPB kernels and HPL shows that the performance of MegaProto/E exceeds that of a dual-Xeon server in all the benchmarks, and its performance ratio ranges from 1.3 to 3.7. These results reveal that our solution of implementing a number of ultra low-power processors in compact packaging is an excellent way to achieve extremely high performance in applications with a certain degree of parallelism. We are now building a multi-unit cluster with 128 CPUs (8 units) to prove that this advantage still holds with higher scalability