Starchart: hardware and software optimization using recursive partitioning regression trees

Authors:
Wenhao Jia; Kelly A. Shaw; Margaret Martonosi
Affiliations:
Princeton University, Princeton, NJ, USA; University of Richmond, Richmond, VA, USA; Princeton University, Princeton, NJ, USA
Venue:
PACT '13: Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques
Year:
2013

Abstract

Graphics processing units (GPUs) are in increasingly wide use, but significant hurdles lie in selecting the appropriate algorithms, runtime parameter settings, and hardware configurations to achieve power and performance goals with them. Exploring hardware and software choices requires time-consuming simulations or extensive real-system measurements. While some auto-tuning support has been proposed, it is often narrow in scope and heuristic in operation. This paper proposes and evaluates a statistical analysis technique, Starchart, that partitions the GPU hardware/software tuning space by automatically discerning important inflection points in design parameter values. Unlike prior methods, Starchart can identify the best parameter choices within different regions of the space. Our tool is efficient, evaluating at most 0.3% of the tuning space and often much less, and it is robust enough to analyze highly variable real-system measurements, not just simulation. In one case study, we use it to automatically find platform-specific parameter settings that are 6.3X faster (for AMD) and 1.3X faster (for NVIDIA) than a single general setting. We also show how power-optimized parameter settings can save 47 W (26% of total GPU power) with little performance loss. Overall, Starchart can serve as a foundation for a range of GPU compiler optimizations, auto-tuners, and programmer tools. Furthermore, because Starchart does not rely on specific GPU features, we expect it to be useful for broader CPU/GPU studies as well.
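The general idea of recursive partitioning named in the title can be illustrated with a minimal sketch. This is not the authors' implementation: it is a generic regression tree that repeatedly picks the parameter and threshold minimizing the summed squared error of the two resulting groups. The parameter names (`block_size`, `unroll`), the synthetic runtimes, and the helper names (`sse`, `best_split`, `build_tree`) are all hypothetical and chosen only for illustration.

```python
from statistics import mean

def sse(values):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(samples, params):
    """Return (param, threshold, cost) for the split with the lowest total SSE.

    samples is a list of (config_dict, measured_value) pairs.
    Returns None when no parameter takes more than one value.
    """
    best = None
    for p in params:
        levels = sorted({cfg[p] for cfg, _ in samples})
        # Candidate thresholds lie midway between adjacent observed values.
        for lo, hi in zip(levels, levels[1:]):
            t = (lo + hi) / 2
            left = [y for cfg, y in samples if cfg[p] <= t]
            right = [y for cfg, y in samples if cfg[p] > t]
            cost = sse(left) + sse(right)
            if best is None or cost < best[2]:
                best = (p, t, cost)
    return best

def build_tree(samples, params, depth=0, max_depth=2, min_leaf=2):
    """Recursively partition the samples into regions with similar values."""
    ys = [y for _, y in samples]
    leaf = {"leaf": True, "mean": mean(ys), "n": len(ys)}
    if depth >= max_depth:
        return leaf
    split = best_split(samples, params)
    if split is None:
        return leaf
    p, t, _ = split
    left = [(c, y) for c, y in samples if c[p] <= t]
    right = [(c, y) for c, y in samples if c[p] > t]
    if len(left) < min_leaf or len(right) < min_leaf:
        return leaf
    return {
        "leaf": False,
        "param": p,
        "threshold": t,
        "left": build_tree(left, params, depth + 1, max_depth, min_leaf),
        "right": build_tree(right, params, depth + 1, max_depth, min_leaf),
    }

# Synthetic (made-up) measurements: runtime drops sharply once block_size
# exceeds 64, while unroll has only a small effect.
PARAMS = ["block_size", "unroll"]
samples = [
    ({"block_size": bs, "unroll": u}, (10.0 if bs <= 64 else 4.0) + 0.1 * u)
    for bs in (32, 64, 128, 256)
    for u in (1, 2, 4)
]

tree = build_tree(samples, PARAMS)
print(tree["param"], tree["threshold"])
```

In this toy data the root split lands on the parameter with the largest effect, and the threshold marks where the measured value changes most sharply, which is the sense in which a tree of this kind exposes inflection points. A real study would draw the samples from measurements or simulation rather than a formula.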