Quantitative performance analysis of the SPEC OMPM2001 benchmarks

Authors:
Vishal Aslot;Rudolf Eigenmann
Affiliations:
Purdue University, Department of Electrical and Computer Engineering, 2600 Gracy Farms Lane, Austin, Texas 78758, USA. Tel.: +1 512 833 5545/ Fax: +1 512 838 3484/ E-mail: vaslot@ecn.purdue.edu;Purdue University, Department of Electrical and Computer Engineering, 1285 Electrical Engineering Building, West Lafayette, Indiana 47907, USA. Tel.: +1 765 494 1741/ Fax: +1 765 494 6440/ E-mail: ...
Venue:
Scientific Programming - OpenMP
Year:
2003

Abstract

Modern computer systems give easy access to multiprocessing by supporting multiple processors on a single physical package. As multiprocessor hardware evolves, new ways of programming it are developed, although some of these merely adopt and standardize older paradigms. One such evolving standard for programming shared-memory parallel computers is the OpenMP API. The Standard Performance Evaluation Corporation (SPEC) has created a suite of parallel programs, SPEC OMP, to compare and evaluate modern shared-memory multiprocessor systems using the OpenMP standard. We have studied these benchmarks in detail to understand their performance on a modern architecture. In this paper, we present detailed measurements of the benchmarks, which we organize, summarize, and display using a Quantitative Model. We present a detailed discussion and derivation of the model. We also discuss the important loops in the SPEC OMPM2001 benchmarks and the reasons for less-than-ideal speedup on our platform.