Input-driven dynamic execution prediction of streaming applications

Authors:
Farhana Aleen;Monirul Sharif;Santosh Pande
Affiliations:
Georgia Institute of Technology, Atlanta, GA, USA;Georgia Institute of Technology, Atlanta, GA, USA;Georgia Institute of Technology, Atlanta, GA, USA
Venue:
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
Year:
2010



Abstract

Streaming applications are promising targets for effectively utilizing multicores because of their inherent amenability to pipelined parallelism. While existing methods of orchestrating streaming programs on multicores have mostly been static, real-world applications show ample variations in execution time that may cause the achieved speedup and throughput to be sub-optimal. One of the principal challenges in moving towards dynamic orchestration has been the lack of approaches that can efficiently predict or accurately estimate upcoming dynamic variations in execution well before they occur. In this paper, we propose an automated dynamic execution behavior prediction approach that can be used to efficiently estimate the time that will be spent in different pipeline stages for upcoming inputs without requiring program execution. This enables dynamic balancing or scheduling of execution to achieve better speedup. Our approach first uses dynamic taint analysis to automatically generate an input-based execution characterization of the streaming program, which identifies the key control points where variation in execution might occur, together with the input elements that cause these variations. We then use this characterization to automatically generate a lightweight emulator from the program. The emulator simulates the execution paths taken for new streaming inputs and estimates the execution time that will be spent processing them, enabling prediction of possible dynamic variations. We present experimental evidence that our technique can accurately and efficiently estimate execution behaviors for several benchmarks. Our experiments show that dynamic orchestration using our predicted execution behavior can achieve considerably higher speedup than static orchestration.
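
The idea of estimating per-stage time from input-dependent control points can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `ControlPoint`, `StageModel`, `estimate_pipeline`, and `bottleneck`, the two example stages, and all cost figures are hypothetical, and the characterization that the paper derives through dynamic taint analysis is simply written out by hand here. The sketch only shows how a lightweight model could walk the control points for a new input and sum assumed costs without running the real stage code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class ControlPoint:
    """Hypothetical input-dependent control point (a loop or a branch)."""
    name: str
    input_index: int                  # input element that drives this point
    trip_count: Callable[[int], int]  # times the guarded body would execute
    cost_per_trip: float              # assumed cost per execution, arbitrary units


@dataclass
class StageModel:
    """Hand-written stand-in for the emulator of one pipeline stage."""
    name: str
    base_cost: float                  # assumed input-independent cost
    control_points: List[ControlPoint]

    def estimate(self, item: Sequence[int]) -> float:
        # Follow each control point using only the input elements it depends on.
        cost = self.base_cost
        for cp in self.control_points:
            cost += cp.trip_count(item[cp.input_index]) * cp.cost_per_trip
        return cost


def estimate_pipeline(stages: List[StageModel], item: Sequence[int]) -> Dict[str, float]:
    """Estimated time per stage for one upcoming input item."""
    return {stage.name: stage.estimate(item) for stage in stages}


def bottleneck(estimates: Dict[str, float]) -> str:
    """Stage with the largest estimate, i.e. a candidate for rebalancing."""
    return max(estimates, key=estimates.get)


# Assumed two-stage pipeline: a loop bounded by item[0], a branch on item[1].
STAGES = [
    StageModel("decode", 5.0, [ControlPoint("block_loop", 0, lambda n: n, 2.0)]),
    StageModel("filter", 3.0, [ControlPoint("slow_path", 1, lambda f: 1 if f > 0 else 0, 12.0)]),
]

if __name__ == "__main__":
    for item in ([4, 1], [10, 0]):
        est = estimate_pipeline(STAGES, item)
        print(item, est, "bottleneck:", bottleneck(est))
```

Under these assumed costs, the predicted bottleneck moves from one stage to the other depending on the input, which is the kind of variation a dynamic scheduler could react to before the input is actually processed.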