Optimizing Techniques for OpenCL Programs on Heterogeneous Platforms

  • Authors:
  • Slo-Li Chu;Chih-Chieh Hsiao

  • Affiliations:
  • Chung Yuan Christian University, Taiwan;Chung Yuan Christian University, Taiwan

  • Venue:
  • International Journal of Grid and High Performance Computing
  • Year:
  • 2012

Abstract

Heterogeneous platforms that consist of a CPU and add-on streaming processors are widely used in modern computer systems. These add-on processors provide substantially more computation capability and memory bandwidth than conventional multi-core platforms, and general-purpose computations can be offloaded to them. Programming these streaming processors to realize their potential performance is challenging, however, because of their diverse underlying architectural characteristics. Several optimization techniques are applied on OpenCL-compatible heterogeneous platforms to achieve thread-level, data-level, and instruction-level parallelism. The architectural implications of these techniques and the optimization principles behind them are discussed. Finally, a case study of the MRI-Q benchmark illustrates the capabilities of these optimization techniques. The experimental results show that the speedup of the optimized kernel over the non-optimized kernel ranges from 8 to 63, depending on the target platform.