Extracting task-level parallelism
ACM Transactions on Programming Languages and Systems (TOPLAS)
Introspective sorting and selection algorithms
Software—Practice & Experience
SystemC: a homogenous environment to test embedded systems
Proceedings of the ninth international symposium on Hardware/software codesign
Synthesis of hardware models in C with pointers and complex data structures
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - System Level Design
Automatically tuned linear algebra software
SC '98 Proceedings of the 1998 ACM/IEEE conference on Supercomputing
Codesign-extended applications
Proceedings of the tenth international symposium on Hardware/software codesign
A performance analysis of the Berkeley UPC compiler
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
SPARK: A High-Lev l Synthesis Framework For Applying Parallelizing Compiler Transformations
VLSID '03 Proceedings of the 16th International Conference on VLSI Design
A quantitative analysis of the speedup factors of FPGAs over processors
FPGA '04 Proceedings of the 2004 ACM/SIGDA 12th international symposium on Field programmable gate arrays
Power Efficient Processor Architecture and The Cell Processor
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
Pace--A Toolset for the Performance Prediction of Parallel and Distributed Systems
International Journal of High Performance Computing Applications
A code refinement methodology for performance-improved synthesis from C
Proceedings of the 2006 IEEE/ACM international conference on Computer-aided design
Parallel Programmability and the Chapel Language
International Journal of High Performance Computing Applications
RAT: a methodology for predicting performance in application design migration to FPGAs
HPRCTA '07 Proceedings of the 1st international workshop on High-performance reconfigurable computing technology and applications: held in conjunction with SC07
PetaBricks: a language and compiler for algorithmic choice
Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
Qilin: exploiting parallelism on heterogeneous multiprocessors with adaptive mapping
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Improving programmability of heterogeneous many-core systems via explicit platform descriptions
Proceedings of the 4th International Workshop on Multicore Software Engineering
Comparing machine learning approaches for context-aware composition
SC'11 Proceedings of the 10th international conference on Software composition
RACECAR: a heuristic for automatic function specialization on multi-core heterogeneous systems
Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming
A compiler and runtime for heterogeneous computing
Proceedings of the 49th Annual Design Automation Conference
Optimized composition of performance-aware parallel components
Concurrency and Computation: Practice & Experience
Elastic computing: A portable optimization framework for hybrid computers
Parallel Computing
VForce: An environment for portable applications on high performance systems with accelerators
Journal of Parallel and Distributed Computing
Adaptation of legacy codes to context-aware composition using aspect-oriented programming
SC'12 Proceedings of the 11th international conference on Software Composition
The RACECAR heuristic for automatic function specialization on multi-core heterogeneous systems
Proceedings of the 2012 international conference on Compilers, architectures and synthesis for embedded systems
Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
High-level support for pipeline parallelism on many-core architectures
Euro-Par'12 Proceedings of the 18th international conference on Parallel Processing
SAMEVED: A System Architecture for Managing and Establishing Virtual Elastic Datacenters
International Journal of Grid and High Performance Computing
Hi-index | 0.00 |
Over the past decade, system architectures have started on a clear trend towards increased parallelism and heterogeneity, often resulting in speedups of 10x to 100x. Despite numerous compiler and high-level synthesis studies, usage of such systems has largely been limited to device experts, due to significantly increased application design complexity. To reduce application design complexity, we introduce elastic computing - a framework that separates functionality from implementation details by enabling designers to use specialized functions, called elastic functions, which enable an optimization framework to explore thousands of possible implementations, even ones using different algorithms. Elastic functions allow designers to execute the same application code efficiently on potentially any architecture and for different runtime parameters such as input size, battery life, etc. In this paper, we present an initial elastic computing framework that transparently optimizes application code onto diverse systems, achieving significant speedups ranging from 1.3x to 46x on a hyper-threaded Xeon system with an FPGA accelerator, a 16-CPU Opteron system, and a quad-core Xeon system.