Hybrid-core systems speed up applications by offloading certain compute operations to hardware accelerators on which they run faster. However, such systems require significant programming and porting effort before the accelerators yield a performance benefit. It is therefore prudent, prior to porting, to estimate the performance benefit an accelerator would provide for a given workload. To address this problem we present a performance-modeling framework that predicts application performance rapidly and accurately for hybrid-core systems. We present predictions for two full-scale HPC applications: HYCOM and Milc. Our results for two accelerators (a GPU and an FPGA) show that gather/scatter and stream operations can be sped up by as much as a factor of 15, and that the overall compute times of Milc and HYCOM improve by 3.4% and 20%, respectively. We also show that, in order to benefit from the accelerators, 70% of the data-transfer latency between the CPU and the accelerators needs to be hidden.
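The gap between a 15x kernel speedup and a modest overall improvement follows from Amdahl-style reasoning: only the offloaded fraction of runtime shrinks, and CPU-accelerator data transfers add back overhead. The sketch below is illustrative only (it is not the paper's modeling framework, and the example numbers are hypothetical, not the HYCOM/Milc measurements); it shows how offloadable fraction, kernel speedup, and transfer cost combine into an overall speedup estimate.

```python
def hybrid_speedup(offload_frac, kernel_speedup, transfer_frac):
    """Estimate overall speedup from offloading part of a workload.

    offload_frac   -- fraction of original runtime spent in offloadable
                      kernels (e.g. gather/scatter and stream operations)
    kernel_speedup -- accelerator speedup of those kernels (e.g. 15x)
    transfer_frac  -- un-hidden CPU<->accelerator transfer time, as a
                      fraction of the original runtime
    """
    # New runtime = untouched CPU part + accelerated part + transfer cost,
    # all normalized to an original runtime of 1.
    new_time = (1.0 - offload_frac) + offload_frac / kernel_speedup + transfer_frac
    return 1.0 / new_time

# Hypothetical example: 25% of runtime offloadable at 15x, with residual
# transfer overhead equal to 5% of the original runtime.
print(round(hybrid_speedup(0.25, 15.0, 0.05), 2))  # → 1.22
```

Even a large kernel speedup yields a small overall gain when the offloadable fraction is modest, and the `transfer_frac` term makes explicit why most of the transfer latency must be hidden before offloading pays off.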