The ubiquity of multi-core processors in commodity computing systems poses a significant programming challenge: using them effectively. As processors with tens or hundreds of cores proliferate, system optimization issues once confined to the high-performance computing (HPC) community will become important to all programmers. The multi-core programmer must balance productivity, portability, and performance. This paper focuses mainly on the performance issues involved in developing software for multi-core platforms, with the aim of helping developers fine-tune their applications for multi-core architectures. We study a way to address the data locality problem in a parallel programming system by combining a tiling approach with OpenMP, and show that this approach yields a considerable performance improvement when it is applied to the algorithm.
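As an illustration of what combining tiling with OpenMP can look like, the following is a minimal sketch in C. The abstract does not say which algorithm the paper tiles, so dense matrix multiplication is an assumption here, as are the function name `matmul_tiled` and the tile edge `TILE` (32 is a placeholder, not a value from the paper). The idea is that each thread works on tiles small enough to stay in cache, so data loaded once is reused many times before it is evicted.

```c
/* Hypothetical tile edge: chosen for illustration only. In practice it is
   tuned so that the tiles of A, B and C being worked on fit in cache. */
#ifndef TILE
#define TILE 32
#endif

static int min_int(int a, int b) { return a < b ? a : b; }

/* Computes C += A * B for n x n row-major matrices, one tile at a time.
   The caller is expected to zero C first if a plain product is wanted. */
void matmul_tiled(int n, const double *A, const double *B, double *C)
{
    /* Each thread takes whole row-tiles of C (the ii loop), so no two
       threads ever write the same element and no locking is needed.
       Without OpenMP enabled, the pragma is ignored and the loop runs
       serially with the same result. */
    #pragma omp parallel for schedule(static)
    for (int ii = 0; ii < n; ii += TILE) {
        for (int kk = 0; kk < n; kk += TILE) {
            for (int jj = 0; jj < n; jj += TILE) {
                /* Clamp tile bounds so n need not be a multiple of TILE. */
                int i_end = min_int(ii + TILE, n);
                int k_end = min_int(kk + TILE, n);
                int j_end = min_int(jj + TILE, n);

                for (int i = ii; i < i_end; i++) {
                    for (int k = kk; k < k_end; k++) {
                        double a = A[i * n + k];
                        /* Innermost loop walks rows of B and C contiguously. */
                        for (int j = jj; j < j_end; j++) {
                            C[i * n + j] += a * B[k * n + j];
                        }
                    }
                }
            }
        }
    }
}
```

With GCC or Clang this would typically be built with `-fopenmp`. Parallelizing only the outermost tile loop keeps the sketch free of data races; whether this loop order and tile size are the best choice depends on the cache hierarchy of the target machine and would need to be measured.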