Multicore processors have become the dominant CPU trend because performance can no longer be gained simply by increasing clock rates, as it could over the past decades in the computer industry. Yet multicore programming is still in its infancy: programmers are not trained to write parallel programs, and technology constraints require manual tuning to achieve high performance. We report our multicore programming experience with optimization techniques such as global memory coalescing and thread divergence avoidance, together with a detailed performance evaluation on a classical dot product application. After these optimizations are applied, the dot product application achieves a speedup of 3.57 over its unoptimized counterpart. Because the dot product is used in many scientific applications, these techniques can be applied directly to other applications.
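As an illustration of the two techniques named above, the following is a minimal sketch of a dot product kernel. It assumes a CUDA-style GPU programming model, which the abstract does not state explicitly, and it is not the authors' implementation. The names `dotPartial` and `kThreads`, the block count, and the input values are all hypothetical choices made for the example, and error checking of the CUDA calls is omitted for brevity.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int kThreads = 256;  // assumed threads per block; must be a power of two

// Each block writes one partial sum of a[i] * b[i] into partial[blockIdx.x].
// The kernel must be launched with exactly kThreads threads per block.
__global__ void dotPartial(const float* a, const float* b, float* partial, int n) {
    __shared__ float cache[kThreads];

    // Grid-stride loop: on each pass, consecutive threads read consecutive
    // elements, so the loads from global memory can be coalesced.
    float sum = 0.0f;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += blockDim.x * gridDim.x) {
        sum += a[i] * b[i];
    }
    cache[threadIdx.x] = sum;
    __syncthreads();

    // Tree reduction with sequential addressing: the active threads are always
    // 0..stride-1, so whole warps go idle together rather than diverging
    // (divergence inside a warp only appears once stride is below the warp size).
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (threadIdx.x < stride) {
            cache[threadIdx.x] += cache[threadIdx.x + stride];
        }
        __syncthreads();
    }
    if (threadIdx.x == 0) partial[blockIdx.x] = cache[0];
}

int main() {
    const int n = 1 << 20;   // assumed vector length
    const int blocks = 64;   // assumed grid size
    std::vector<float> a(n, 1.0f), b(n, 2.0f), partial(blocks);

    float *dA, *dB, *dPartial;
    cudaMalloc(&dA, n * sizeof(float));
    cudaMalloc(&dB, n * sizeof(float));
    cudaMalloc(&dPartial, blocks * sizeof(float));
    cudaMemcpy(dA, a.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dB, b.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    dotPartial<<<blocks, kThreads>>>(dA, dB, dPartial, n);
    cudaMemcpy(partial.data(), dPartial, blocks * sizeof(float),
               cudaMemcpyDeviceToHost);

    // Finish the reduction on the host by adding the per-block partial sums.
    double dot = 0.0;
    for (float p : partial) dot += p;
    printf("dot = %.1f (expected %.1f)\n", dot, 2.0 * n);

    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dPartial);
    return 0;
}
```

An unoptimized variant would typically give each thread a contiguous chunk of the vectors, so that neighbouring threads read addresses far apart, and would reduce with an interleaved test such as `threadIdx.x % (2 * stride) == 0`, which leaves active and idle threads mixed within the same warp. How much speedup the optimized form yields depends on the hardware and problem size; the figure reported above is the authors' measurement, not a property of this sketch.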