Implementation and performance analysis of parallel conjugate gradient on the Cell Broadband Engine

  • Authors:
  • F. N. Sibai;H. K. Kidwai

  • Affiliations:
  • Faculty of Information Technology, UAE University, Al Ain, United Arab Emirates;Faculty of Information Technology, UAE University, Al Ain, United Arab Emirates

  • Venue:
  • IBM Journal of Research and Development
  • Year:
  • 2010

Abstract

This paper presents our implementation of the method of parallel conjugate gradients (CGs) on the Cell Broadband Engine® (Cell/B.E.®). The solution of linear systems of equations is one of the most central-processing-unit-intensive steps in oil reservoir simulation applications and can greatly benefit from the multitude of single-instruction-multiple-data-capable synergistic processor element (SPE) cores in the Cell/B.E. processor. We assume that the linear system of equations is of standard form Ax = B, where A is a square sparse coefficient matrix. Several solvers exist with distinct advantages and disadvantages. When dealing with 1-D, 2-D, and 3-D reservoirs, the resulting coefficient matrix can be formulated as a banded matrix. This paper reports the implementation of the serial CG on the Cell/B.E. PowerPC® processor element (PPE) and the parallelization and performance analysis of CG across 1, 8, and 16 SPEs for tridiagonal (1-D reservoir grid), pentadiagonal (2-D reservoir grid), and heptadiagonal (3-D reservoir grid) matrices. Our implementation is shown to scale well with data size, grid dimensionality, and number of cores.
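For orientation, the serial CG iteration on a banded system can be sketched as follows. This is a minimal illustration in plain Python, not the authors' Cell/B.E. code: it assumes A is symmetric positive definite, stores each diagonal as a full-length list keyed by its offset, and uses hypothetical helper names (`banded_matvec`, `conjugate_gradient`, `laplacian_1d`) chosen only for this example. The 1-D test matrix is a standard tridiagonal Laplacian, which is an assumption and not taken from the paper.

```python
import math

def banded_matvec(diags, x):
    """Return y = A x, where diags maps an offset k to a list d with A[i][i+k] = d[i]."""
    n = len(x)
    y = [0.0] * n
    for k, d in diags.items():
        # Only rows for which column i + k lies inside the matrix contribute.
        for i in range(max(0, -k), min(n, n - k)):
            y[i] += d[i] * x[i + k]
    return y

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def conjugate_gradient(diags, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite banded A, starting from x = 0."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with x = 0
    p = list(r)          # initial search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if math.sqrt(rs_old) <= tol:
            break
        Ap = banded_matvec(diags, p)
        alpha = rs_old / dot(p, Ap)                       # step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old                            # direction update factor
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x

def laplacian_1d(n):
    """Tridiagonal test matrix (assumed here): 2 on the main diagonal, -1 beside it."""
    return {-1: [-1.0] * n, 0: [2.0] * n, 1: [-1.0] * n}

if __name__ == "__main__":
    A = laplacian_1d(8)
    x_true = [float(i + 1) for i in range(8)]
    b = banded_matvec(A, x_true)
    x = conjugate_gradient(A, b)
    print(max(abs(a - c) for a, c in zip(x, x_true)))  # error against the known solution
```

In this storage scheme, pentadiagonal and heptadiagonal systems only add more offsets to the dictionary (for example, plus or minus the grid width for a 2-D grid). The matrix-vector product and the two dot products dominate each iteration, and those are the kernels a parallel version would distribute across cores.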