A C++ library for rapid development of efficient parallel dense linear algebra codes for multicore computers

Authors:
Peiyi Tang
Affiliations:
University of Arkansas at Little Rock, Little Rock, AR
Venue:
Proceedings of the 51st ACM Southeast Conference
Year:
2013



Abstract

Programming codes for task-based parallel execution requires a full analysis of the data dependencies between tasks. However, finding all the flow, anti, and output dependencies is not easy for non-trivial algorithms. In this paper, we present a simple C++ library that analyzes all the data dependencies between tasks and builds the task graph automatically. Developing parallel dense linear algebra codes with our library and two other C++ libraries, Intel TBB and NICTA Armadillo, is simple. The resulting parallel codes are also efficient, owing to the efficient task scheduling of Intel TBB and the fast matrix operations of NICTA Armadillo.
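
To make the three dependency kinds concrete, the following is a minimal sketch of how a task graph could be derived from per-task read and write sets. It is not the library's actual interface, which the abstract does not describe. The names `Task`, `overlaps`, `depends`, and `build_task_graph` are hypothetical, tiles are assumed to be identified by plain integer ids, and tasks are assumed to be listed in sequential program order.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical task record (illustrative only, not the paper's API).
struct Task {
    std::set<int> reads;   // ids of the tiles this task reads
    std::set<int> writes;  // ids of the tiles this task writes
};

// True if the two sets share at least one tile id.
bool overlaps(const std::set<int>& a, const std::set<int>& b) {
    for (int x : a)
        if (b.count(x) != 0) return true;
    return false;
}

// True if `later` must wait for `earlier`, where `earlier` precedes
// `later` in sequential program order.
bool depends(const Task& earlier, const Task& later) {
    return overlaps(earlier.writes, later.reads)     // flow   (read after write)
        || overlaps(earlier.reads, later.writes)     // anti   (write after read)
        || overlaps(earlier.writes, later.writes);   // output (write after write)
}

// Returns an edge (i, j) with i < j for every dependent pair of tasks.
// This simple version also keeps transitively implied edges.
std::vector<std::pair<std::size_t, std::size_t>>
build_task_graph(const std::vector<Task>& tasks) {
    std::vector<std::pair<std::size_t, std::size_t>> edges;
    const std::size_t n = tasks.size();
    for (std::size_t j = 0; j < n; ++j)
        for (std::size_t i = 0; i < j; ++i)
            if (depends(tasks[i], tasks[j])) edges.emplace_back(i, j);
    // Sanity check: a graph over n ordered tasks has at most n(n-1)/2 edges.
    assert(edges.size() <= n * (n - 1) / 2);
    return edges;
}
```

Each edge `(i, j)` means task `j` may not start before task `i` finishes. A task scheduler such as the one in Intel TBB could then run any tasks that are not connected by a path concurrently. The pairwise comparison is quadratic in the number of tasks, which is acceptable for illustration but is only one of several possible designs.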