Basic matrix computations such as vector and matrix addition, dot product, outer product, matrix transpose, matrix-vector product and matrix multiplication are challenging computational kernels that arise in scientific computing. In this paper, we parallelize these basic matrix computations using multicore parallel programming tools: Pthreads, OpenMP, Intel Cilk++, Intel TBB, Intel ArBB, SMPSs, SWARM and FastFlow. The purpose of this paper is to present a unified quantitative and qualitative study of these tools for parallel matrix computations on multicore processors. Based on the performance results obtained with compiler optimization enabled, we conclude that Intel ArBB and SWARM are the most appropriate tools because they combine good performance with simplicity of programming. In particular, Intel ArBB is a good choice for compute-intensive operations such as the matrix product, because it gives significant speedup over the serial implementation. SWARM, on the other hand, gives good performance for matrix operations of medium size such as vector addition, matrix addition, outer product and matrix-vector product.
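To make the kind of kernel under study concrete, the following is a minimal sketch of a row-parallel matrix-vector product written with OpenMP, one of the tools listed above. It is not the code evaluated in the paper: the function name `matvec`, the row-major storage layout and the choice to parallelize over rows are all assumptions made for illustration.

```c
#include <stddef.h>

/* Illustrative sketch only (hypothetical helper, not from the paper).
 * Computes y = A * x, where A is an n-by-m matrix stored row-major in a[].
 * Each row's dot product is independent of the others, so the outer loop
 * can be divided among threads without any synchronization on y. */
void matvec(size_t n, size_t m, const double *a, const double *x, double *y)
{
    long i; /* signed loop index, accepted by all OpenMP versions */

    /* The pragma is guarded so the file also builds as plain serial C
     * when OpenMP is not enabled at compile time. */
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (i = 0; i < (long)n; i++) {
        double sum = 0.0; /* private to each iteration, hence to each thread */
        for (size_t j = 0; j < m; j++)
            sum += a[(size_t)i * m + j] * x[j];
        y[i] = sum;
    }
}
```

Compiled with an OpenMP flag such as `-fopenmp` on GCC or Clang, the outer loop is shared among the available threads. Without the flag, the same source runs serially, which gives a simple baseline for measuring speedup.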