A Buffered-Mode MPI Implementation for the Cell BE™ Processor

  • Authors:
  • Arun Kumar;Ganapathy Senthilkumar;Murali Krishna;Naresh Jayam;Pallav K. Baruah;Raghunath Sharma;Ashok Srinivasan;Shakti Kapoor

  • Affiliations:
  • Dept. of Mathematics and Computer Science, Sri Sathya Sai University, Prashanthi Nilayam, India;Dept. of Mathematics and Computer Science, Sri Sathya Sai University, Prashanthi Nilayam, India;Dept. of Mathematics and Computer Science, Sri Sathya Sai University, Prashanthi Nilayam, India;Dept. of Mathematics and Computer Science, Sri Sathya Sai University, Prashanthi Nilayam, India;Dept. of Mathematics and Computer Science, Sri Sathya Sai University, Prashanthi Nilayam, India;Dept. of Mathematics and Computer Science, Sri Sathya Sai University, Prashanthi Nilayam, India;Dept. of Computer Science, Florida State University;IBM, Austin

  • Venue:
  • ICCS '07 Proceedings of the 7th international conference on Computational Science, Part I: ICCS 2007
  • Year:
  • 2007

Abstract

The Cell Broadband Engine™ is a heterogeneous multi-core architecture developed by IBM, Sony and Toshiba. It has eight computation-intensive cores (SPEs), each with a small local memory, and a single PowerPC core. The SPEs have a total peak performance of 204.8 Gflop/s in single precision and 14.64 Gflop/s in double precision, so the Cell has good potential for high performance computing. However, its unconventional architecture makes it difficult to program. We propose an implementation of the core features of MPI as a solution to this problem, which can enable a large class of existing applications to be ported to the Cell. Our MPI implementation attains bandwidth up to 6.01 GB/s and latency as low as 0.41 μs. The significance of our work is in demonstrating the effectiveness of intra-Cell MPI, consequently enabling the porting of MPI applications to the Cell with minimal effort.
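
As a rough sanity check on the single-precision peak quoted in the abstract, the 204.8 figure can be reproduced from the SPE count. The sketch below assumes a 3.2 GHz clock, 4 single-precision SIMD lanes per SPE, and a fused multiply-add counted as 2 floating-point operations per lane per cycle. None of these three parameters is stated in the abstract, so they should be read as assumptions rather than as the authors' derivation. The double-precision figure is not derived here.

```latex
\[
P_{\text{SP}}
  = \underbrace{8}_{\text{SPEs}}
    \times \underbrace{3.2\ \text{GHz}}_{\text{clock (assumed)}}
    \times \underbrace{4}_{\text{SIMD lanes (assumed)}}
    \times \underbrace{2}_{\text{flops per multiply-add (assumed)}}
  = 204.8\ \text{Gflop/s}
\]
```

Under the same assumptions, each SPE contributes 25.6 Gflop/s of that total.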