International Journal of High Performance Computing Applications
Vector processors, emerging (homogeneous and heterogeneous) multi-core processors, and a number of accelerator devices potentially offer an order-of-magnitude speedup over microprocessor execution times for scientific applications that can exploit their SIMD execution units. Nevertheless, identifying and mapping a diverse set of scientific algorithms onto these devices, and achieving high performance with them, is a challenging task, let alone predicting and projecting performance on these devices. Conventional performance modeling strategies are unable to capture the performance characteristics of complex processing systems and therefore fail to predict achievable runtime performance. Moreover, most of the effort involved in developing a performance modeling strategy, and subsequently a framework, for unique and emerging processing devices is prohibitively expensive. In this study, we explore a minimum set of attributes that are necessary to capture the performance characteristics of scientific calculations on the Cray X1E multi-streaming vector processor. We include a set of specialized performance attributes of the X1E system, including the degrees of multi-streaming and vectorization, within our symbolic modeling framework called Modeling Assertions (MA). Using our scheme, the performance prediction error rates for a scientific calculation are reduced from over 200% to less than 25%.