Multicore accelerators are used today to supplement traditional superscalar processors in the nodes of massively parallel computers with extra floating-point computation power. This paper presents our parallelization, performance enhancement, and evaluation of the conjugate gradient (CG) linear equation solver, with enhanced matrix multiplication, on the Cell Broadband Engine accelerator. The paper also compares the CG performance on the Cell with that of two CG implementations on a computer with two quad-core Xeon processors, one using OpenMP and the other using OpenMPI. We describe the enhancements made to the CG code, in particular to matrix multiplication and synchronization, and analyze CG performance for heptadiagonal matrices on single and dual Cell Broadband Engine packages, with 8 and 16 synergistic processing elements respectively, and on the Xeon. We also report the breakdown of communication and computation time and the floating-point operations per second rate. Our parallel CG solver is shown to scale well with data size, grid dimensionality, and number of cores. Copyright © 2011 John Wiley & Sons, Ltd.
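For orientation, the following is a minimal serial sketch of the kind of computation the abstract describes: a CG iteration whose matrix-vector product is applied to a heptadiagonal matrix. It is not the authors' Cell, OpenMP, or MPI code. It assumes the heptadiagonal matrix is the standard 7-point Laplacian on an `nx × ny × nz` grid with zero boundary values, which is one common source of such matrices but is not stated in the abstract. The function names `heptadiagonal_matvec`, `dot`, and `conjugate_gradient` are hypothetical and chosen for illustration.

```python
import math


def heptadiagonal_matvec(x, nx, ny, nz):
    """Return y = A x, where A is the 7-point Laplacian (an assumed example
    of a heptadiagonal matrix) on an nx*ny*nz grid, applied matrix-free."""
    y = [0.0] * len(x)
    plane = nx * ny
    for k in range(nz):
        for j in range(ny):
            for i in range(nx):
                idx = i + nx * j + plane * k
                acc = 6.0 * x[idx]           # main diagonal
                if i > 0:
                    acc -= x[idx - 1]        # offset -1
                if i < nx - 1:
                    acc -= x[idx + 1]        # offset +1
                if j > 0:
                    acc -= x[idx - nx]       # offset -nx
                if j < ny - 1:
                    acc -= x[idx + nx]       # offset +nx
                if k > 0:
                    acc -= x[idx - plane]    # offset -nx*ny
                if k < nz - 1:
                    acc -= x[idx + plane]    # offset +nx*ny
                y[idx] = acc
    return y


def dot(a, b):
    """Inner product; in a parallel solver this is a global reduction."""
    return sum(p * q for p, q in zip(a, b))


def conjugate_gradient(b, nx, ny, nz, tol=1e-10, max_iter=1000):
    """Solve A x = b by unpreconditioned CG, starting from x = 0.
    Returns (x, iterations_used)."""
    x = [0.0] * len(b)
    r = list(b)                  # residual b - A x for x = 0
    p = list(r)                  # initial search direction
    rs_old = dot(r, r)
    b_norm = math.sqrt(rs_old)
    if b_norm == 0.0:
        return x, 0              # zero right-hand side: x = 0 is exact
    for it in range(1, max_iter + 1):
        Ap = heptadiagonal_matvec(p, nx, ny, nz)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        if math.sqrt(rs_new) <= tol * b_norm:
            return x, it         # relative residual below tolerance
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x, max_iter


if __name__ == "__main__":
    nx = ny = nz = 4
    x_true = [float(i % 5) for i in range(nx * ny * nz)]
    b = heptadiagonal_matvec(x_true, nx, ny, nz)
    x, iters = conjugate_gradient(b, nx, ny, nz)
    print("iterations:", iters)
    print("max error:", max(abs(u - v) for u, v in zip(x, x_true)))
```

Each iteration performs one matrix-vector product and two inner products. In a parallel implementation the matrix-vector product dominates the arithmetic, while the inner products require a reduction across all cores, which is where synchronization cost typically arises.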