A model for predicting multiprocessor performance on iterative algorithms is developed. Each iteration consists of some amount of access to global data and some amount of local processing. The iterations may be synchronous or asynchronous, and the processors may or may not incur waiting time, depending on the relationship between the access time and processing time. The effects of processor speed, memory speed, and interconnection network speed on performance are studied. The model also shows that the way an algorithm is decomposed into parallel processes has a significant impact on performance. The model's predictions are calibrated with experimental measurements.
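
The relationship between access time and processing time can be illustrated with a toy calculation. The sketch below is not the model developed in the paper. It assumes deterministic times, a single shared global memory that serves one access at a time, and identical processors. The function names and parameters (`async_cycle_time`, `sync_cycle_time`, `speedup`, `access`, `compute`) are hypothetical and chosen only for illustration.

```python
def async_cycle_time(n, access, compute):
    """Steady-state time per iteration for each of n asynchronous processors.

    Assumption (not from the paper): accesses to global data are serialized
    through one shared memory and all times are deterministic. If the other
    n - 1 accesses fit inside one compute phase, no processor waits;
    otherwise the shared memory is the bottleneck.
    """
    return max(access + compute, n * access)


def sync_cycle_time(n, access, compute):
    """Time per iteration when all n processors start each iteration together.

    Assumption: every processor requests global data at the same instant,
    so the last one waits for n - 1 accesses before its own access and
    its local processing.
    """
    return n * access + compute


def speedup(n, access, compute, synchronous=False):
    """Iterations completed per unit time relative to a single processor."""
    if synchronous:
        cycle = sync_cycle_time(n, access, compute)
    else:
        cycle = async_cycle_time(n, access, compute)
    return n * (access + compute) / cycle


if __name__ == "__main__":
    # Hypothetical parameters: 1 time unit of global access, 9 of local work.
    for n in (1, 4, 10, 20):
        print(n, speedup(n, 1.0, 9.0), speedup(n, 1.0, 9.0, synchronous=True))
```

Under these assumptions, asynchronous speedup grows linearly until the shared memory saturates (here at 10 processors) and then stays flat, while the synchronous variant loses time to waiting at every processor count above one.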