BLAS Based on Block Data Structures

Authors:
Greg Henry
Affiliations:
-
Venue:
BLAS Based on Block Data Structures
Year:
1992

Cited by:
A Family of High-Performance Matrix Multiplication Algorithms
ICCS '01 Proceedings of the International Conference on Computational Sciences-Part I

Supermatrix out-of-order scheduling of matrix operations for SMP and multi-core architectures
Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures

SuperMatrix: a multithreaded runtime scheduling system for algorithms-by-blocks
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming

An Algorithm-by-Blocks for SuperMatrix Band Cholesky Factorization
High Performance Computing for Computational Science - VECPAR 2008

Abstract

The optimization of the BLAS is discussed, with examples given for the IBM superscalar RISC S/6000. The approach suggested is to use block data structures based on store-by-block schemes. We give results and analysis of the optimization of DGEMM. We also suggest how these results can be applied to the higher level factorizations and the other BLAS. Results are given to show the advantages of using block data structures.
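
The store-by-block idea in the abstract can be illustrated with a minimal sketch. It assumes a square matrix whose order n is a multiple of the block size nb, with blocks laid out in row-major order and elements inside each block also in row-major order. That ordering, and the helper names to_block_storage, from_block_storage, and block_gemm, are hypothetical choices made here for illustration; they are not taken from the report, which targets the S/6000 and may order blocks differently. Python is used only to show the indexing, not the performance.

```python
def to_block_storage(a, n, nb):
    """Pack an n x n matrix (list of rows) so each nb x nb block is contiguous."""
    assert n % nb == 0, "sketch assumes nb divides n"
    p = n // nb                      # blocks per dimension
    packed = [0.0] * (n * n)
    for bi in range(p):
        for bj in range(p):
            base = (bi * p + bj) * nb * nb   # start of block (bi, bj)
            for i in range(nb):
                for j in range(nb):
                    packed[base + i * nb + j] = a[bi * nb + i][bj * nb + j]
    return packed


def from_block_storage(packed, n, nb):
    """Inverse of to_block_storage: rebuild the list-of-rows matrix."""
    p = n // nb
    a = [[0.0] * n for _ in range(n)]
    for bi in range(p):
        for bj in range(p):
            base = (bi * p + bj) * nb * nb
            for i in range(nb):
                for j in range(nb):
                    a[bi * nb + i][bj * nb + j] = packed[base + i * nb + j]
    return a


def block_gemm(a_blk, b_blk, n, nb):
    """Compute C = A * B with all three operands in store-by-block layout."""
    p = n // nb
    bs = nb * nb                     # elements per block
    c_blk = [0.0] * (n * n)
    for bi in range(p):
        for bj in range(p):
            c_base = (bi * p + bj) * bs
            for bk in range(p):
                a_base = (bi * p + bk) * bs
                b_base = (bk * p + bj) * bs
                # Small dense product on three contiguous blocks.
                for i in range(nb):
                    for k in range(nb):
                        aik = a_blk[a_base + i * nb + k]
                        for j in range(nb):
                            c_blk[c_base + i * nb + j] += aik * b_blk[b_base + k * nb + j]
    return c_blk
```

In this layout the innermost product touches three contiguous runs of nb*nb elements, which is the locality property a store-by-block scheme is meant to provide. A caller would pack A and B once with to_block_storage, call block_gemm, and unpack the result with from_block_storage only if a conventional layout is needed afterwards.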