Static coarse grain task scheduling with cache optimization using OpenMP

  • Authors:
  • Hirofumi Nakano;Kazuhisa Ishizaka;Motoki Obata;Keiji Kimura;Hironori Kasahara

  • Affiliations:
  • Waseda University, 3-4-1 Ohkubo, Shinjuku-ku, Tokyo, 169-8555, Japan;Waseda University & Japanese Millennium Project IT 21 Advanced Parallelizing Compiler Project;Waseda University & Japanese Millennium Project IT 21 Advanced Parallelizing Compiler Project;Waseda University & Japanese Millennium Project IT 21 Advanced Parallelizing Compiler Project;Waseda University & Japanese Millennium Project IT 21 Advanced Parallelizing Compiler Project

  • Venue:
  • International Journal of Parallel Programming - Special issue: OpenMP: Experiences and implementations
  • Year:
  • 2003

Abstract

Effective use of cache memory is becoming more important as the gap between processor speed and memory access speed widens. Multigrain parallelism is likewise becoming more important for improving effective performance beyond the limits of loop-iteration-level parallelism. With these factors in mind, this paper proposes a static scheduling scheme for coarse grain tasks that takes cache optimization into account. At compile time, the scheme first decomposes tasks and data according to the cache size, then schedules the coarse grain tasks to threads so that data shared among them can be passed via the cache. The scheme is implemented in the OSCAR Fortran multigrain parallelizing compiler and evaluated on a Sun Ultra80 four-processor SMP workstation using Swim and Tomcatv from SPECfp95. On 4 processors, the proposed scheme achieves speedups of 4.56 times for Swim and 2.37 times for Tomcatv over the Sun Forte HPC Ver. 6 update 1 loop parallelizing compiler.
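The idea of decomposing tasks by cache size and then statically assigning tasks that share data to the same thread can be illustrated with a small sketch. This is not the OSCAR compiler's algorithm, which the abstract does not spell out; it is a simplified greedy assignment under stated assumptions. The names `Task`, `decompose`, and `schedule`, the loop names, the costs, and the array and cache sizes are all hypothetical, and the sketch assumes every loop sweeps the same data, so that part `g` of one loop shares its data chunk with part `g` of every other loop.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str   # hypothetical label, e.g. "loopA[0]"
    group: int  # index of the data chunk this task touches
    cost: int   # estimated execution cost in arbitrary units


def decompose(loops, array_bytes, cache_bytes):
    """Split every loop into equal parts so one part's data fits in cache."""
    parts = -(-array_bytes // cache_bytes)  # ceiling division
    tasks = []
    for loop_name, loop_cost in loops:
        for g in range(parts):
            tasks.append(Task(f"{loop_name}[{g}]", g, loop_cost // parts))
    return tasks, parts


def schedule(tasks, num_threads):
    """Statically assign whole data groups to threads, heaviest group first."""
    groups = {}
    for t in tasks:
        groups.setdefault(t.group, []).append(t)

    def group_cost(members):
        return sum(t.cost for t in members)

    load = [0] * num_threads
    plan = [[] for _ in range(num_threads)]
    # Sort by descending cost, breaking ties by group index for determinism.
    ordered = sorted(groups.items(), key=lambda kv: (-group_cost(kv[1]), kv[0]))
    for _, members in ordered:
        target = load.index(min(load))  # least-loaded thread so far
        # Tasks of one group stay together and run back to back, so the
        # chunk loaded by the first task can still be cached for the next.
        plan[target].extend(members)
        load[target] += group_cost(members)
    return plan


# Hypothetical workload: two loops over a 16 MiB data set, 4 MiB cache.
loops = [("loopA", 800), ("loopB", 400)]
tasks, parts = decompose(loops, 16 * 2**20, 4 * 2**20)
plan = schedule(tasks, num_threads=2)
for thread_id, assigned in enumerate(plan):
    print(thread_id, [t.name for t in assigned])
```

Keeping each group on a single thread is the design choice that matters here: load balancing is done across groups rather than across individual tasks, trading some balance for the chance to reuse cached data between consecutive tasks.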