To write parallel code for task-based execution, a full analysis of the data dependencies between tasks is required. However, finding all the flow, anti-, and output dependencies is difficult for non-trivial algorithms. In this paper, we present a simple C++ library that analyzes all the data dependencies between tasks and builds the task graph automatically. With our library and two other C++ libraries, Intel TBB and NICTA Armadillo, developing parallel dense linear algebra codes is straightforward. The resulting parallel codes are also efficient, owing to the task scheduling of the Intel TBB library and the fast matrix operations of the NICTA Armadillo library.
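To make the three dependency kinds concrete, the following is a minimal sketch of how flow (read-after-write), anti- (write-after-read), and output (write-after-write) dependencies could be derived from per-task read and write sets. It is not the API of the library described here: `Task`, `Edge`, `Dep`, `BlockState`, and `build_task_graph` are hypothetical names chosen for illustration, and the sketch assumes each task declares the named data blocks it reads and writes, with tasks listed in sequential program order.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical task record: the named data blocks a task reads and writes.
struct Task {
    std::string name;
    std::set<std::string> reads;
    std::set<std::string> writes;
};

enum class Dep { Flow, Anti, Output };

// Directed edge: task `from` must complete before task `to` may start.
struct Edge {
    std::size_t from;
    std::size_t to;
    Dep kind;
    std::string block;  // data block that causes the dependency
};

// Per-block bookkeeping while scanning tasks in program order.
struct BlockState {
    bool has_writer = false;
    std::size_t last_writer = 0;
    std::vector<std::size_t> readers;  // tasks that read since the last write
};

// Assumption: `tasks` is given in the order of the sequential algorithm.
std::vector<Edge> build_task_graph(const std::vector<Task>& tasks) {
    std::map<std::string, BlockState> state;
    std::vector<Edge> edges;
    for (std::size_t i = 0; i < tasks.size(); ++i) {
        const Task& t = tasks[i];
        // Flow: this task reads a block that an earlier task wrote.
        for (const auto& b : t.reads) {
            auto it = state.find(b);
            if (it != state.end() && it->second.has_writer) {
                assert(it->second.last_writer < i);  // edges always point forward
                edges.push_back({it->second.last_writer, i, Dep::Flow, b});
            }
        }
        for (const auto& b : t.writes) {
            BlockState& s = state[b];
            // Anti: this task overwrites a block earlier tasks still had to read.
            for (std::size_t r : s.readers) {
                edges.push_back({r, i, Dep::Anti, b});
            }
            // Output: this task overwrites a block an earlier task wrote.
            if (s.has_writer) {
                edges.push_back({s.last_writer, i, Dep::Output, b});
            }
        }
        // Record this task's accesses only after all its edges are emitted.
        for (const auto& b : t.reads) {
            state[b].readers.push_back(i);
        }
        for (const auto& b : t.writes) {
            BlockState& s = state[b];
            s.has_writer = true;
            s.last_writer = i;
            s.readers.clear();
        }
    }
    return edges;
}
```

Because every edge points from an earlier task to a later one, the resulting graph is acyclic, so a task scheduler can run any task whose predecessors have all finished. Tracking only the last writer and the readers since that write keeps the edge count low; dependencies on older writers are implied transitively.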