Evaluation of Cache-based Superscalar and Cacheless Vector Architectures for Scientific Computations

  • Authors:
  • Leonid Oliker;Andrew Canning;Jonathan Carter;John Shalf;David Skinner;Stephane Ethier;Rupak Biswas;Jahed Djomehri;Rob Van der Wijngaart

  • Affiliations:
  • CRD/NERSC, Lawrence Berkeley National Laboratory, Berkeley, CA;CRD/NERSC, Lawrence Berkeley National Laboratory, Berkeley, CA;CRD/NERSC, Lawrence Berkeley National Laboratory, Berkeley, CA;CRD/NERSC, Lawrence Berkeley National Laboratory, Berkeley, CA;CRD/NERSC, Lawrence Berkeley National Laboratory, Berkeley, CA;Princeton University, NJ;NASA Ames Research Center, Moffett Field, CA;NASA Ames Research Center, Moffett Field, CA;NASA Ames Research Center, Moffett Field, CA

  • Venue:
  • Proceedings of the 2003 ACM/IEEE conference on Supercomputing
  • Year:
  • 2003

Abstract

The growing gap between sustained and peak performance for scientific applications is a well-known problem in high-end computing. The recent development of parallel vector systems offers the potential to bridge this gap for many computational science codes and deliver a substantial increase in computing capabilities. This paper examines the intranode performance of the NEC SX-6 vector processor and the cache-based IBM Power3/4 superscalar architectures across a number of scientific computing areas. First, we present the performance of a microbenchmark suite that examines low-level machine characteristics. Next, we study the behavior of the NAS Parallel Benchmarks. Finally, we evaluate the performance of several scientific computing codes. Results demonstrate that the SX-6 achieves high performance on a large fraction of our applications and often significantly outperforms the cache-based architectures. However, certain applications are not easily amenable to vectorization and would require extensive algorithm and implementation reengineering to utilize the SX-6 effectively.