In this paper we present an analytical framework for parallel program performance prediction. The main thrust of this work is to provide a means for treating realistic applications within a single unified framework. Our approach is based on a set of non-linear equations that describe the application, processor configuration, network, and memory operations. These equations are solved iteratively because the application execution rate depends on the communication latencies. The iterative solution technique is efficient: it typically requires only a few iterations to converge. Our modeling methodology achieves a good balance between abstraction and accuracy by accounting for both the time and space dimensions of memory references while maintaining a simple description of the workload. We demonstrate both the practicality and the accuracy of our approach by comparing predicted results with measurements taken on a commercial multiprocessor system. We found the model to be faithful in reflecting changes in processor speed and in the number and placement of allocated processors.
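To illustrate why such equations are solved iteratively, the following is a minimal sketch of a fixed-point iteration, not the model from the paper. It assumes a toy system in which each of `num_procs` processors computes for `compute_time` between remote references, and in which the reference latency follows a single-queue M/M/1 formula with service time `service_time`. The function name, parameters, and equations are all hypothetical and chosen only to show the circular dependency between execution rate and latency.

```python
def solve_latency(num_procs, compute_time, service_time,
                  tol=1e-10, max_iters=1000):
    """Fixed-point iteration for a toy rate/latency model (illustrative only).

    Assumed equations (not taken from the paper):
      request rate per processor = 1 / (compute_time + latency)
      utilization                = num_procs * service_time * rate
      latency                    = service_time / (1 - utilization)
    """
    latency = service_time  # initial guess: latency with no contention
    for iteration in range(1, max_iters + 1):
        rate = 1.0 / (compute_time + latency)
        utilization = num_procs * service_time * rate
        if utilization >= 1.0:
            raise ValueError("toy model saturated; plain iteration cannot continue")
        new_latency = service_time / (1.0 - utilization)
        if abs(new_latency - latency) < tol:
            return new_latency, iteration
        latency = new_latency
    raise RuntimeError("did not converge within max_iters")


if __name__ == "__main__":
    latency, iterations = solve_latency(num_procs=8, compute_time=1.0,
                                        service_time=0.05)
    print(f"latency={latency:.6f} after {iterations} iterations")
```

In this toy setting the latency depends on the request rate and the request rate depends on the latency, so neither can be computed first; repeatedly substituting one into the other settles after a handful of steps for lightly loaded configurations. A real model would couple many such equations for processors, network, and memory.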