A new array format for symmetric and triangular matrices

  • Authors:
  • John A. Gunnels;Fred G. Gustavson

  • Affiliations:
  • IBM T.J. Watson Research Center, Yorktown Heights, NY;IBM T.J. Watson Research Center, Yorktown Heights, NY

  • Venue:
  • PARA'04: Proceedings of the 7th International Conference on Applied Parallel Computing: State of the Art in Scientific Computing
  • Year:
  • 2004

Abstract

We introduce a new data format for storing triangular and symmetric matrices called HFP (Hybrid Full Packed). The standard two-dimensional arrays of Fortran and C (also known as full format) that are used to store triangular and symmetric matrices waste half the storage space but provide high performance via the use of Level 3 BLAS. Packed format arrays fully utilize storage (array space) but provide low performance, as there are no Level 3 packed BLAS. HFP format combines the good features of packed and full storage: because HFP is itself entirely a full format, it obtains high performance using Level 3 (L3) BLAS, and it requires exactly the same minimal storage as packed storage. Each LAPACK full and/or packed symmetric/triangular routine becomes a single new HFP routine. LAPACK has some 125 times two such symmetric/triangular routines, and hence some 125 new HFP routines can be produced. These new routines are trivial to produce, as they merely consist of calls to existing LAPACK routines and Level 3 BLAS. Data format conversion routines between the conventional formats and HFP are briefly discussed, as these routines make existing codes compatible with the new HFP format. We use the LAPACK routines for Cholesky factorization and inverse computation to illustrate this new work and to describe its performance. Performance of HFP versus LAPACK full routines is slightly better while using half the storage. Performance is roughly one to seven times faster than LAPACK packed routines while using the same storage. Performance runs were done on the IBM Power 4 using only existing LAPACK routines and ESSL Level 3 BLAS.
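The storage trade-off described above can be made concrete with a small sketch. It does not implement HFP itself, whose layout the abstract does not specify. It only illustrates the two conventional formats being compared: full storage with n*n elements, and the standard column-major lower packed layout with n(n+1)/2 elements. The function names (`packed_index`, `pack_lower`, `unpack_symmetric`) are hypothetical helpers chosen for illustration, and indices are 0-based.

```python
def full_size(n):
    """Elements stored by a full n-by-n array."""
    return n * n


def packed_size(n):
    """Elements stored by packed format: one triangle, including the diagonal."""
    return n * (n + 1) // 2


def packed_index(i, j, n):
    """0-based position of A[i][j] (i >= j) in column-major lower packed storage."""
    if i < j:
        raise ValueError("lower packed storage holds only entries with i >= j")
    # Columns 0..j-1 hold n, n-1, ..., n-j+1 entries, i.e. j*(2n-j+1)/2 in total;
    # adding the offset (i - j) within column j gives i + j*(2n-j-1)/2.
    return i + j * (2 * n - j - 1) // 2


def pack_lower(a):
    """Copy the lower triangle of a square list-of-lists matrix into a packed list."""
    n = len(a)
    ap = [0.0] * packed_size(n)
    for j in range(n):
        for i in range(j, n):
            ap[packed_index(i, j, n)] = a[i][j]
    return ap


def unpack_symmetric(ap, n):
    """Rebuild the full symmetric matrix from its lower packed representation."""
    a = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(j, n):
            a[i][j] = a[j][i] = ap[packed_index(i, j, n)]
    return a
```

For n = 4, full storage holds 16 elements while packed storage holds 10. For large n the ratio approaches one half, which is the saving the abstract attributes to both packed and HFP formats.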