Run-Time Parallelization and Scheduling of Loops
IEEE Transactions on Computers
The hierarchical task graph and its use in auto-scheduling
ICS '91 Proceedings of the 5th international conference on Supercomputing
A scalable method for run-time loop parallelization
International Journal of Parallel Programming
Dynamic feedback: an effective technique for adaptive computing
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Optimizing compilers for modern architectures: a dependence-based approach
Optimizing compilers for modern architectures: a dependence-based approach
NetBench: a benchmarking suite for network processors
Proceedings of the 2001 IEEE/ACM international conference on Computer-aided design
Doany: Not Just Another Parallel Loop
Proceedings of the 5th International Workshop on Languages and Compilers for Parallel Computing
Introduction to Stochastic Search and Optimization
Introduction to Stochastic Search and Optimization
ADAPT: Automated De-Coupled Adaptive Program Transformation
ICPP '00 Proceedings of the Proceedings of the 2000 International Conference on Parallel Processing
LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Optimistic parallelism requires abstractions
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
Parallel-stage decoupled software pipelining
Proceedings of the 6th annual IEEE/ACM international symposium on Code generation and optimization
The PARSEC benchmark suite: characterization and architectural implications
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Mapping parallelism to multi-cores: a machine learning based approach
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
Multicore diversity: a software developer's nightmare
ACM SIGOPS Operating Systems Review
PetaBricks: a language and compiler for algorithmic choice
Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
Adapting application execution in CMPs using helper threads
Journal of Parallel and Distributed Computing
The multikernel: a new OS architecture for scalable multicore systems
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
The Cilk++ concurrency platform
Proceedings of the 46th Annual Design Automation Conference
Qilin: exploiting parallelism on heterogeneous multiprocessors with adaptive mapping
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Efficient Scheduling of Nested Parallel Loops on Multi-Core Systems
ICPP '09 Proceedings of the 2009 International Conference on Parallel Processing
Lazy binary-splitting: a run-time adaptive work-stealing scheduler
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
ACM SIGMETRICS Performance Evaluation Review
Contention aware execution: online contention detection and response
Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization
Composing parallel software efficiently with lithe
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
Feedback-directed pipeline parallelism
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
The Paralax infrastructure: automatic parallelization with a helping hand
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Elastic executions from inelastic programs
Proceedings of the 6th International Symposium on Software Engineering for Adaptive and Self-Managing Systems
Commutative set: a language extension for implicit parallel programming
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Parallelism orchestration using DoPE: the degree of parallelism executive
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Run-time automatic performance tuning for multicore applications
Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part I
Online Adaptive Code Generation and Tuning
IPDPS '11 Proceedings of the 2011 IEEE International Parallel & Distributed Processing Symposium
Encyclopedia of Parallel Computing
Encyclopedia of Parallel Computing
From sequential programming to flexible parallel execution
Proceedings of the 2012 international conference on Compilers, architectures and synthesis for embedded systems
Holistic run-time parallelism management for time and energy efficiency
Proceedings of the 27th international ACM conference on International conference on supercomputing
Sambamba: runtime adaptive parallel execution
Proceedings of the 3rd International Workshop on Adaptive Self-Tuning Computing Systems
Hi-index | 0.00 |
Workload, platform, and available resources constitute a parallel program's execution environment. Most parallelization efforts statically target an anticipated range of environments, but performance generally degrades outside that range. Existing approaches address this problem with dynamic tuning but do not optimize a multiprogrammed system holistically. Further, they either require manual programming effort or are limited to array-based data-parallel programs. This paper presents Parcae, a generally applicable automatic system for platform-wide dynamic tuning. Parcae includes (i) the Nona compiler, which creates flexible parallel programs whose tasks can be efficiently reconfigured during execution; (ii) the Decima monitor, which measures resource availability and system performance to detect change in the environment; and (iii) the Morta executor, which cuts short the life of executing tasks, replacing them with other functionally equivalent tasks better suited to the current environment. Parallel programs made flexible by Parcae outperform original parallel implementations in many interesting scenarios.