Speedup Versus Efficiency in Parallel Systems

Authors:
D. L. Eager;J. Zahorjan;E. D. Lazowska
Affiliations:
Univ. of Saskatchewan, Saskatoon, Canada;Univ. of Washington, Seattle;Univ. of Washington, Seattle
Venue:
IEEE Transactions on Computers
Year:
1989

Citing 5
Cited 90

The Manchester prototype dataflow computer

Communications of the ACM - Special section on computer architecture
A Survey of Parallel Machine Organization and Programming

ACM Computing Surveys (CSUR)
The Operational Analysis of Queueing Network Models

ACM Computing Surveys (CSUR)
On the Execution of Programs by Many Processors

Performance '83 Proceedings of the 9th International Symposium on Computer Performance Modelling, Measurement and Evaluation
Modelling and analysis of distributed software systems

SOSP '79 Proceedings of the seventh ACM symposium on Operating systems principles

Optimal on-line load balancing

SPAA '89 Proceedings of the first annual ACM symposium on Parallel algorithms and architectures
Analysis of computation-communication issues in dynamic dataflow architectures

ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Characterizations of parallelism in applications and their use in scheduling

SIGMETRICS '89 Proceedings of the 1989 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
A performance evaluation of a general parallel processing model

SIGMETRICS '90 Proceedings of the 1990 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Processor scheduling in shared memory multiprocessors

SIGMETRICS '90 Proceedings of the 1990 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Indeterminate behavior with determinate semantics in parallel programs

FPCA '89 Proceedings of the fourth international conference on Functional programming languages and computer architecture
Scalability of parallel machines

Communications of the ACM
Processor-pool-based scheduling for large-scale NUMA multiprocessors

SIGMETRICS '91 Proceedings of the 1991 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Analysis of scalability of parallel algorithms and architectures: a survey

ICS '91 Proceedings of the 5th international conference on Supercomputing
Cost-performance analysis of heterogeneity in supercomputer architectures

Proceedings of the 1990 ACM/IEEE conference on Supercomputing
Another view on parallel speedup

Proceedings of the 1990 ACM/IEEE conference on Supercomputing
The Processor Working Set and its Use in Scheduling Multiprocessor Systems

IEEE Transactions on Software Engineering
What is scalability?

ACM SIGARCH Computer Architecture News
On Parallel Processing Systems: Amdahl's Law Generalized and Some Results on Optimal Design

IEEE Transactions on Software Engineering
Steps Toward Architecture-Independent Image Processing

Computer
Further results using the overhead model for parallel systems

IBM Journal of Research and Development
Willow: a scalable shared memory multiprocessor

Proceedings of the 1992 ACM/IEEE conference on Supercomputing
Use of application characteristics and limited preemption for run-to-completion parallel processor scheduling policies

SIGMETRICS '94 Proceedings of the 1994 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Bounds on the speedup and efficiency of partial synchronization in parallel processing systems

Journal of the ACM (JACM)
A Hierarchical Task Queue Organization for Shared-Memory Multiprocessor Systems

IEEE Transactions on Parallel and Distributed Systems
Performance and Scalability of Preconditioned Conjugate Gradient Methods on Parallel Computers

IEEE Transactions on Parallel and Distributed Systems
Coordinated allocation of memory and processors in multiprocessors

Proceedings of the 1996 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Guaranteeing Good Memory Bounds for Parallel Programs

IEEE Transactions on Software Engineering
Processor Saving Scheduling Policies for Multiprocessor Systems

IEEE Transactions on Computers
Preemptive scheduling of parallel jobs on multiprocessors

Proceedings of the seventh annual ACM-SIAM symposium on Discrete algorithms
Workload Execution Strategies and Parallel Speedup on Clustered Computers

IEEE Transactions on Computers
Scheduling multithreaded computations by work stealing

Journal of the ACM (JACM)
Space Efficient Execution of Deterministic Parallel Programs

IEEE Transactions on Software Engineering
Scal-Tool: pinpointing and quantifying scalability bottlenecks in DSM multiprocessors

SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
Characteristics of scalability and their impact on performance

Proceedings of the 2nd international workshop on Software and performance
An operational semantics for parallel lazy evaluation

ICFP '00 Proceedings of the fifth ACM SIGPLAN international conference on Functional programming
High-performance computer architecture and algorithm simulator

Journal on Educational Resources in Computing (JERIC)
Models of Parallel Applications with Large Computation and I/O Requirements

IEEE Transactions on Software Engineering
Integrated Performance Models for SPMD Applications and MIMD Architectures

IEEE Transactions on Parallel and Distributed Systems
A parallel algorithm for Lagrange interpolation on the star graph

Journal of Parallel and Distributed Computing
A parallel workload model and its implications for processor allocation

Cluster Computing
Paging tradeoffs in distributed-shared-memory multiprocessors

Proceedings of the 1994 ACM/IEEE conference on Supercomputing
Parallelising large irregular programs: an experience with Naira

Information Sciences—Informatics and Computer Science: An International Journal - Special issue: Software engineering: Systems and tools
Modeling Speedup (n) Greater than n

IEEE Transactions on Parallel and Distributed Systems
The Effect of Scheduling Discipline on Spin Overhead in Shared Memory Parallel Systems

IEEE Transactions on Parallel and Distributed Systems
Cost and Time-Cost Effectiveness of Multiprocessing

IEEE Transactions on Parallel and Distributed Systems
The Scalability of FFT on Parallel Computers

IEEE Transactions on Parallel and Distributed Systems
Lower and Upper Bounds on Time for Multiprocessor Optimal Schedules

IEEE Transactions on Parallel and Distributed Systems
Using moldability to improve the performance of supercomputer jobs

Journal of Parallel and Distributed Computing
When the Herd Is Smart: Aggregate Behavior in the Selection of Job Request

IEEE Transactions on Parallel and Distributed Systems
Adaptive Scheduling for Master-Worker Applications on the Computational Grid

GRID '00 Proceedings of the First IEEE/ACM International Workshop on Grid Computing
Parallel Models and Job Characterization for System Scheduling

ICCS '01 Proceedings of the International Conference on Computational Science-Part II
Improving Processor Allocation through Run-Time Measured Efficiency

IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
A Model for Moldable Supercomputer Jobs

IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
A Dynamic Periodicity Detector: Application to Speedup Computation

IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
Compiler Synthesis of Task Graphs for Parallel Program Performance Prediction

LCPC '00 Proceedings of the 13th International Workshop on Languages and Compilers for Parallel Computing-Revised Papers
A Tool to Schedule Parallel Applications on Multiprocessors: The NANOS CPU MANAGER

IPDPS '00/JSSPP '00 Proceedings of the Workshop on Job Scheduling Strategies for Parallel Processing
Towards an Operational Semantics for a Parallel Non-Strict Functional Language

IFL '98 Selected Papers from the 10th International Workshop on Implementation of Functional Languages
Parallel Job Scheduling: A Performance Perspective

Performance Evaluation: Origins and Directions
Parallel ray tracing on a chip

Practical parallel rendering
Task scheduling with locality consideration for a clustered parallel FL reduction system

PAS '95 Proceedings of the First Aizu International Symposium on Parallel Algorithms/Architecture Synthesis
Supercompilers for massively parallel architectures

PAS '95 Proceedings of the First Aizu International Symposium on Parallel Algorithms/Architecture Synthesis
On-line scheduling of scalable real-time tasks on multiprocessor systems

Journal of Parallel and Distributed Computing
Parallel program performance prediction using deterministic task graph analysis

ACM Transactions on Computer Systems (TOCS)
Parallel Polynomial Root Extraction on A Ring of Processors

IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 15 - Volume 16
Parallel implementation of a transportation network model

Journal of Parallel and Distributed Computing
Performance-Driven Processor Allocation

IEEE Transactions on Parallel and Distributed Systems
Application Representations for Multiparadigm Performance Modeling of Large-Scale Parallel Scientific Codes

International Journal of High Performance Computing Applications
Adaptive scheduling with parallelism feedback

Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Adaptive work stealing with parallelism feedback

Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Self-adaptive applications on the grid

Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
$P^3T+$: A performance estimator for distributed and parallel programs

Scientific Programming
Performance-driven processor allocation

OSDI'00 Proceedings of the 4th conference on Symposium on Operating System Design & Implementation - Volume 4
The performance of synchronous parallel polynomial root extraction on a ring multicomputer

Cluster Computing
Speedup and scalability analysis of Master--Slave applications on large heterogeneous clusters

Journal of Parallel and Distributed Computing
Adaptive work-stealing with parallelism feedback

ACM Transactions on Computer Systems (TOCS)
Towards resilient high performance applications through real time reliability metric generation and autonomous failure correction

Proceedings of the 2009 workshop on Resiliency in high performance
Effective GIS Mobile Query System

FGIT '09 Proceedings of the 1st International Conference on Future Generation Information Technology
Satin: A high-level and efficient grid programming model

ACM Transactions on Programming Languages and Systems (TOPLAS)
The Cilk++ concurrency platform

The Journal of Supercomputing
Toward a better parallel performance metric

Parallel Computing
On the evaluation of gridification effort and runtime aspects of JGRIM applications

Future Generation Computer Systems
Parallel computing for data reduction

AIKED'10 Proceedings of the 9th WSEAS international conference on Artificial intelligence, knowledge engineering and data bases
The Cilkview scalability analyzer

Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
A work-efficient parallel breadth-first search algorithm (or how to cope with the nondeterminism of reducers)

Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
A parallel algorithm to compute data synopsis

WSEAS Transactions on Information Science and Applications
On the energy-performance tradeoff for parallel applications

EPEW'10 Proceedings of the 7th European performance engineering conference on Computer performance engineering
Energy-efficient scheduling for parallel real-time tasks based on level-packing

Proceedings of the 2011 ACM Symposium on Applied Computing
Improving the scalability of ILP-based multi-relational concept discovery system through parallelization

Knowledge-Based Systems
Improving speedup and response times by replicating parallel programs on a SNOW

JSSPP'04 Proceedings of the 10th international conference on Job Scheduling Strategies for Parallel Processing
DAG3: a tool for design and analysis of applications for multicore architectures

Proceedings of the 27th Annual ACM Symposium on Applied Computing
GPU-based roofs' solar potential estimation using LiDAR data

Computers & Geosciences
RACE: a scalable and elastic parallel system for discovering repeats in very long sequences

Proceedings of the VLDB Endowment
A performance-aware quality of service-driven scheduler for multicore processors

ACM SIGBED Review - Special Issue on the 3rd Embedded Operating System Workshop (EWiLi 2013)

Quantified Score

Hi-index	15.01

Abstract

The tradeoff between speedup and efficiency that is inherent to a software system is investigated. The extent to which this tradeoff is determined by the average parallelism of the software system, as contrasted with other, more detailed, characterizations, is shown. The extent to which both speedup and efficiency can simultaneously be poor is bounded: it is shown that for any software system and any number of processors, the sum of the average processor utilization (i.e., efficiency) and the attained fraction of the maximum possible speedup must exceed one. Bounds are given on speedup and efficiency, and on the incremental benefit and cost of allocating additional processors. An explicit formulation, as well as bounds, is given for the location of the knee of the execution time-efficiency profile, where the benefit per unit cost is maximized.
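
The sum bound stated in the abstract can be illustrated numerically. The sketch below is not taken from this record. It assumes the lower bound on speedup commonly attributed to this paper, S(n) >= nA / (n + A - 1) for n processors and average parallelism A, and it takes A as the maximum possible speedup (the speedup with unlimited processors). All function names are hypothetical.

```python
def speedup_lower_bound(n: int, avg_parallelism: float) -> float:
    """Assumed lower bound on speedup: S(n) >= n*A / (n + A - 1).

    n is the number of processors (n >= 1) and avg_parallelism is the
    average parallelism A of the software system (A >= 1).
    """
    if n < 1 or avg_parallelism < 1:
        raise ValueError("need n >= 1 and avg_parallelism >= 1")
    a = avg_parallelism
    return n * a / (n + a - 1)


def efficiency_lower_bound(n: int, avg_parallelism: float) -> float:
    """Efficiency is speedup per processor, so E(n) >= A / (n + A - 1)."""
    return speedup_lower_bound(n, avg_parallelism) / n


def tradeoff_sum_lower_bound(n: int, avg_parallelism: float) -> float:
    """Efficiency plus attained fraction of the maximum speedup.

    Assumes the maximum possible speedup equals A. Under the assumed bound
    this evaluates to (n + A) / (n + A - 1), which is always above one.
    """
    s = speedup_lower_bound(n, avg_parallelism)
    return s / n + s / avg_parallelism


if __name__ == "__main__":
    # Print the worst-case figures for a few processor counts, with A = 4.
    for n in (1, 2, 4, 8, 16):
        s = speedup_lower_bound(n, 4.0)
        e = efficiency_lower_bound(n, 4.0)
        total = tradeoff_sum_lower_bound(n, 4.0)
        print(f"n={n:2d}  speedup>={s:.3f}  efficiency>={e:.3f}  sum>={total:.3f}")
```

Under these assumptions, the worst-case sum approaches one only as n + A grows large, which matches the abstract's claim that speedup and efficiency cannot both be arbitrarily poor at the same time.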