Programming FFT on DSM Multiprocessors

  • Authors:
  • Hongzhang Shan;Jianhua Feng;Hongzhong Shan

  • Venue:
  • HPC '00 Proceedings of the Fourth International Conference on High-Performance Computing in the Asia-Pacific Region - Volume 2
  • Year:
  • 2000

Abstract

How well the shared address space programming model performs for the coarse-grained communicating programs that have traditionally been common in scientific computing is still unclear. In this paper, we use the challenging 1-dimensional FFT, a regular coarse-grained program, as our driving application to study how to obtain high performance for such applications under the shared address space programming model on a hardware-supported cache-coherent distributed-memory machine. We find that performance is strongly affected by data placement, and that proper data placement is critical to the success of this kind of application. Prefetching further improves performance by 10 to 50 percent for the data sets we studied. Naive programming easily creates a performance bottleneck by introducing much more contention, which leads to a large performance loss. When programmed properly, shared address space programs deliver much better performance than the other popular programming models, such as MPI and SHMEM.
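To make the data placement concern concrete, here is a minimal sketch of a transpose-based ("six-step") 1-dimensional FFT. The abstract does not say which FFT formulation the paper uses, so this organization is an assumption, as are all names below (`six_step_fft`, `remote_reads`, and the blocked row-to-processor mapping). The sketch runs sequentially and only counts how many elements a transpose would fetch from rows owned by another processor; it does not model caches, prefetching, or contention.

```python
import cmath

def dft(x):
    """Naive O(n^2) DFT; stands in for the local row FFT and serves as a reference."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def transpose(m):
    """Transpose a list-of-rows matrix (the communication-heavy step)."""
    return [list(col) for col in zip(*m)]

def six_step_fft(x, n1, n2):
    """Length n1*n2 FFT organized as transposes plus independent row FFTs."""
    n = n1 * n2
    assert len(x) == n
    a = [x[r * n2:(r + 1) * n2] for r in range(n1)]  # view input as n1 x n2, row-major
    b = transpose(a)                                 # step 1: n2 x n1
    c = [dft(row) for row in b]                      # step 2: n2 row FFTs of length n1
    for j2 in range(n2):                             # step 3: twiddle factors
        for k1 in range(n1):
            c[j2][k1] *= cmath.exp(-2j * cmath.pi * j2 * k1 / n)
    d = transpose(c)                                 # step 4: n1 x n2
    e = [dft(row) for row in d]                      # step 5: n1 row FFTs of length n2
    f = transpose(e)                                 # step 6: n2 x n1
    return [v for row in f for v in row]             # row-major flatten gives X[k]

def remote_reads(n_src_rows, n_dst_rows, n_procs):
    """Count transpose elements whose source row lives on another processor.

    Hypothetical placement: rows are split into equal contiguous blocks,
    one block per processor (n_procs must divide both row counts).
    """
    src_block = n_src_rows // n_procs
    dst_block = n_dst_rows // n_procs
    return sum(1
               for r in range(n_dst_rows)      # destination row, written by its owner
               for c in range(n_src_rows)      # source row holding the element
               if r // dst_block != c // src_block)
```

Under this assumed blocked placement, the row FFTs in steps 2 and 5 touch only locally owned rows, while each transpose moves a fraction (P - 1)/P of the matrix between processors. That gives one plausible reading of why placement and the scheduling of remote accesses dominate performance for this kind of program.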