Implementation of mixed precision in solving systems of linear equations on the Cell processor: Research Articles

Authors:
Jakub Kurzak;Jack Dongarra
Affiliations:
Department of Computer Science, University of Tennessee, Knoxville, TN 37996, U.S.A.;Department of Computer Science, University of Tennessee, Knoxville, TN 37996, U.S.A. and Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, U.S.A.
Venue:
Concurrency and Computation: Practice & Experience
Year:
2007

Citing 0
Cited 11

Mixed Precision Iterative Refinement Techniques for the Solution of Dense Linear Systems
International Journal of High Performance Computing Applications

Optimizing large scale chemical transport models for multicore platforms
Proceedings of the 2008 Spring simulation multiconference

On the Implementation of Boundary Element Engineering Codes on the Cell Broadband Engine
High Performance Computing for Computational Science - VECPAR 2008

Implementing a parallel matrix factorization library on the cell broadband engine
Scientific Programming - High Performance Computing with the Cell Broadband Engine

QR factorization for the Cell Broadband Engine
Scientific Programming - High Performance Computing with the Cell Broadband Engine

Optimizing matrix multiplication for a short-vector SIMD architecture - CELL processor
Parallel Computing

Accuracy and performance of single versus double precision arithmetics for maximum likelihood phylogeny reconstruction
PPAM'09 Proceedings of the 8th international conference on Parallel processing and applied mathematics: Part II

Scalable heterogeneous parallelism for atmospheric modeling and simulation
The Journal of Supercomputing

Cache blocking
PARA'10 Proceedings of the 10th international conference on Applied Parallel and Scientific Computing - Volume Part I

Cache blocking for linear algebra algorithms
PPAM'11 Proceedings of the 9th international conference on Parallel Processing and Applied Mathematics - Volume Part I

A Simple Compressive Sensing Algorithm for Parallel Many-Core Architectures
Journal of Signal Processing Systems


Abstract

This paper describes the design concepts behind implementations of mixed-precision linear algebra routines targeting the Cell processor. It describes in detail an implementation that solves a system of linear equations using Gaussian elimination in single precision, followed by iterative refinement of the solution to full double-precision accuracy. With this approach, the algorithm achieves close to an order of magnitude higher performance on the Cell processor than the standard double-precision algorithm. The code is effectively an implementation of the high-performance LINPACK benchmark, as it meets all of the requirements concerning the problem being solved and the numerical properties of the solution. Copyright © 2007 John Wiley & Sons, Ltd.
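
The technique summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' Cell implementation: it is plain, unoptimized Python, and the names `f32`, `lu_factor_f32`, `lu_solve_f32` and `solve_mixed` are hypothetical. Single precision is emulated by rounding every intermediate result through the standard `struct` module, and the stopping test (a scaled infinity-norm residual with an assumed tolerance of 1e-14) is an illustrative choice, not the criterion used in the paper. The factorization and the triangular solves run in single precision; only the residual and the solution update run in double precision.

```python
import struct


def f32(x):
    """Round a Python float (IEEE double) to the nearest IEEE single value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_factor_f32(a):
    """LU factorization with partial pivoting, emulated in single precision."""
    n = len(a)
    lu = [[f32(v) for v in row] for row in a]
    piv = list(range(n))  # piv[i] = original index of the row now at position i
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular to single precision")
        lu[k], lu[p] = lu[p], lu[k]
        piv[k], piv[p] = piv[p], piv[k]
        for i in range(k + 1, n):
            lu[i][k] = f32(lu[i][k] / lu[k][k])
            for j in range(k + 1, n):
                lu[i][j] = f32(lu[i][j] - f32(lu[i][k] * lu[k][j]))
    return lu, piv


def lu_solve_f32(lu, piv, b):
    """Solve L U x = P b using the single-precision factors."""
    n = len(lu)
    y = [f32(b[piv[i]]) for i in range(n)]
    for i in range(n):  # forward substitution (unit lower triangle)
        for j in range(i):
            y[i] = f32(y[i] - f32(lu[i][j] * y[j]))
    for i in reversed(range(n)):  # back substitution (upper triangle)
        for j in range(i + 1, n):
            y[i] = f32(y[i] - f32(lu[i][j] * y[j]))
        y[i] = f32(y[i] / lu[i][i])
    return y


def solve_mixed(a, b, max_iter=30, tol=1e-14):
    """Single-precision solve refined toward double-precision accuracy."""
    n = len(a)
    lu, piv = lu_factor_f32(a)       # expensive O(n^3) step, single precision
    x = lu_solve_f32(lu, piv, b)     # initial solution, single precision
    norm_a = max(sum(abs(v) for v in row) for row in a)
    for _ in range(max_iter):
        # Residual r = b - A x, computed in double precision.
        r = [b[i] - sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm_x = max(abs(v) for v in x)
        if max(abs(v) for v in r) <= tol * norm_a * norm_x:
            break
        z = lu_solve_f32(lu, piv, r)          # correction, single precision
        x = [x[i] + z[i] for i in range(n)]   # update, double precision
    return x


if __name__ == "__main__":
    a = [[4.0, -2.0, 1.0], [3.0, 6.0, -4.0], [2.0, 1.0, 8.0]]
    b = [3.0, 3.0, 28.0]  # chosen so that the exact solution is [1, 2, 3]
    print(solve_mixed(a, b))
```

The sketch reuses the single-precision factors for every correction step, so each refinement iteration costs only O(n^2) operations on top of the one O(n^3) factorization. Refinement of this kind is only expected to converge when the matrix is not too ill-conditioned relative to single precision.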