Heterogeneous platforms that integrate different processors, such as GPUs and multi-core CPUs, are becoming popular in high-performance computing. While most applications currently use only the homogeneous parts of these platforms, we argue that a large class of applications can benefit from their heterogeneity: massively parallel imbalanced applications. Such applications emerge, for example, from numerical methods and simulations based on variable time steps. In this paper, we present Glinda, a framework for accelerating imbalanced applications on heterogeneous computing platforms. The framework detects the application's workload characteristics, chooses among the available parallel solutions given the hardware configuration, and automatically obtains the optimal workload decomposition and distribution. Our experiments on parallelizing a heavily imbalanced acoustic ray tracing application show that Glinda improves application performance in multiple scenarios, achieving up to 12x speedup over manually configured parallel solutions.
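
To make the idea of workload decomposition concrete, the following is a minimal sketch of a static two-way split for an imbalanced workload. It is not Glinda's algorithm, which the abstract does not specify. The function name `split_workload`, the per-item cost estimates, and the `gpu_rate` and `cpu_rate` throughput parameters are all assumptions introduced for illustration. The sketch assigns a prefix of the work items to the GPU and the remainder to the CPU, and picks the split point that minimizes the estimated finish time of the slower device.

```python
from itertools import accumulate


def split_workload(costs, gpu_rate, cpu_rate):
    """Pick k so that items[:k] run on the GPU and items[k:] on the CPU.

    costs    -- estimated work per item (hypothetical units, e.g. ray steps)
    gpu_rate -- assumed GPU throughput in work units per second
    cpu_rate -- assumed CPU throughput in work units per second

    Returns (k, estimated_finish_time), where the finish time is the
    maximum of the two devices' estimated run times.
    """
    if gpu_rate <= 0 or cpu_rate <= 0:
        raise ValueError("rates must be positive")

    total = sum(costs)
    # prefix[k] is the total cost of the first k items.
    prefix = [0, *accumulate(costs)]

    best_k, best_time = 0, float("inf")
    for k, gpu_work in enumerate(prefix):
        # Both devices run concurrently; the slower one sets the finish time.
        finish = max(gpu_work / gpu_rate, (total - gpu_work) / cpu_rate)
        if finish < best_time:
            best_k, best_time = k, finish
    return best_k, best_time


if __name__ == "__main__":
    # Balanced items, GPU assumed 3x faster than the CPU.
    print(split_workload([4, 4, 4, 4], gpu_rate=3, cpu_rate=1))
    # Imbalanced items: one heavy item dominates the total cost.
    print(split_workload([1, 1, 1, 1, 8], gpu_rate=3, cpu_rate=1))
```

A real framework would also have to account for data-transfer overhead and for how well each device handles irregular work, both of which this sketch ignores.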