Performance prediction is an important engineering tool that provides valuable feedback on design choices in program synthesis and machine architecture development. We present an analytic performance modeling approach that aims to minimize prediction cost while providing prediction accuracy sufficient to guide major code and data mapping decisions. Our approach is based on a performance simulation language called Pamela. Apart from simulation, Pamela features a symbolic analysis technique that compiles Pamela models into symbolic performance models, which trade prediction accuracy for the lowest possible solution cost. We demonstrate our approach through a large number of theoretical and practical modeling case studies, including six parallel programs and two distributed-memory machines. The average prediction error of our approach is less than 10 percent, while the average worst-case error is limited to 50 percent. We show that this accuracy is sufficient to correctly select the best coding or partitioning strategy. For programs expressed in a high-level, structured programming model, such as data-parallel programs, symbolic performance modeling can be entirely automated. We report on experiments with a Pamela model generator built within a data-parallel compiler for distributed-memory machines. Our results show that, with negligible program annotation, symbolic performance models are automatically compiled in seconds, while their solution cost is on the order of milliseconds.
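To make the idea of a cheap symbolic performance model more concrete, the following Python sketch builds a closed-form time estimate from sequential and parallel composition and uses it to choose between two coding strategies. It is an illustration only: it does not use Pamela syntax or Pamela's actual analysis rules, and the combinators (`Task`, `Seq`, `Par`), the parameter names (`N`, `P`, `t_c`, `t_m`, `t_io`), and both example strategies are assumptions introduced here. Contention for shared resources, which a realistic model would have to account for, is ignored.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, Sequence

Params = Dict[str, float]


@dataclass(frozen=True)
class Task:
    """Leaf of the model: a cost expressed as a function of the parameters."""
    cost: Callable[[Params], float]

    def time(self, p: Params) -> float:
        return self.cost(p)


@dataclass(frozen=True)
class Seq:
    """Sequential composition: estimated time is the sum of the parts."""
    parts: Sequence

    def time(self, p: Params) -> float:
        return sum(part.time(p) for part in self.parts)


@dataclass(frozen=True)
class Par:
    """Parallel composition: estimated time is the slowest branch
    (assumes no contention between branches)."""
    parts: Sequence

    def time(self, p: Params) -> float:
        return max(part.time(p) for part in self.parts)


# Hypothetical workload: N loop iterations of cost t_c each.
serial = Task(lambda p: p["N"] * p["t_c"])

# Hypothetical partitioned version: each of P processors runs ceil(N/P)
# iterations, overlapped with an I/O task of cost t_io, followed by a
# gather step that costs t_m per remote processor.
compute = Task(lambda p: math.ceil(p["N"] / p["P"]) * p["t_c"])
io = Task(lambda p: p["t_io"])
gather = Task(lambda p: (p["P"] - 1) * p["t_m"])
partitioned = Seq([Par([compute, io]), gather])

strategies = {"serial": serial, "partitioned": partitioned}


def best_strategy(p: Params) -> str:
    """Return the name of the strategy with the lowest predicted time."""
    return min(strategies, key=lambda name: strategies[name].time(p))


if __name__ == "__main__":
    large = {"N": 1000, "P": 10, "t_c": 1.0, "t_m": 5.0, "t_io": 50.0}
    small = {"N": 10, "P": 10, "t_c": 1.0, "t_m": 5.0, "t_io": 50.0}
    print(best_strategy(large), strategies["partitioned"].time(large))
    print(best_strategy(small), strategies["partitioned"].time(small))
```

Evaluating such a model costs a handful of arithmetic operations per parameter setting, which is why a ranking of strategies can be obtained quickly even when the absolute prediction is only approximate.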