The tradeoff between speedup and efficiency that is inherent to a software system is investigated. The extent to which this tradeoff is determined by the average parallelism of the software system, as contrasted with other, more detailed characterizations, is shown. The extent to which both speedup and efficiency can simultaneously be poor is bounded: it is shown that for any software system and any number of processors, the sum of the average processor utilization (i.e., efficiency) and the attained fraction of the maximum possible speedup must exceed one. Bounds are given on speedup and efficiency, and on the incremental benefit and cost of allocating additional processors. An explicit formulation is given for the location of the knee of the execution time-efficiency profile, where the benefit per unit cost is maximized, together with bounds on that location.
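The central inequality can be checked numerically on a toy workload. The sketch below is not taken from the paper. It assumes a simple "fluid" phase model in which each phase has a fixed amount of work that can be spread evenly over at most `width` processors, and the names `Phase`, `PROFILE`, and the helper functions are hypothetical. Under that assumption the average parallelism is A = T_1 / T_inf, the maximum possible speedup equals A, and the standard work-conserving bound T_n <= T_inf + (T_1 - T_inf) / n gives S(n) >= nA / (n + A - 1). That bound is stated here as an assumption rather than quoted from the abstract.

```python
from typing import List, Tuple

# Hypothetical workload description: (total work in the phase,
# maximum number of processors the phase can keep busy).
Phase = Tuple[float, int]

PROFILE: List[Phase] = [(4.0, 1), (12.0, 4), (16.0, 8)]


def execution_time(phases: List[Phase], n: int) -> float:
    """Time on n processors, assuming work in a phase divides evenly
    over min(n, width) processors (an idealized, work-conserving model)."""
    return sum(work / min(n, width) for work, width in phases)


def average_parallelism(phases: List[Phase]) -> float:
    """A = T_1 / T_inf: total work divided by the unbounded-processor time."""
    t_1 = execution_time(phases, 1)
    t_inf = sum(work / width for work, width in phases)
    return t_1 / t_inf


def speedup(phases: List[Phase], n: int) -> float:
    """S(n) = T_1 / T_n."""
    return execution_time(phases, 1) / execution_time(phases, n)


def efficiency(phases: List[Phase], n: int) -> float:
    """E(n) = S(n) / n, the average processor utilization."""
    return speedup(phases, n) / n


if __name__ == "__main__":
    a = average_parallelism(PROFILE)
    for n in (1, 2, 4, 8, 16):
        s = speedup(PROFILE, n)
        e = efficiency(PROFILE, n)
        # The abstract's claim: efficiency plus the attained fraction
        # of the maximum speedup exceeds one.
        print(f"n={n:2d}  S={s:.3f}  E={e:.3f}  E + S/A = {e + s / a:.3f}")
```

In this model the sum E(n) + S(n)/A stays above one for every n, so a processor count that yields poor efficiency necessarily attains a large fraction of the achievable speedup, and vice versa.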