Heterogeneous multi-core platforms are increasingly prevalent due to their perceived superior performance over homogeneous systems. The best performance, however, can only be achieved if tasks are accurately mapped to the right processors. OpenCL programs can be partitioned to take advantage of all the available processors in a system. However, finding the best partitioning for a given heterogeneous system is difficult and depends on the hardware and software implementation. We propose a portable partitioning scheme for OpenCL programs on heterogeneous CPU-GPU systems. We develop a purely static approach based on predictive modelling and program features. When evaluated over a suite of 47 benchmarks, our model achieves a speedup of 1.57 over a state-of-the-art dynamic run-time approach, 3.02 over a purely multi-core approach, and 1.55 over using the GPU alone.
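To make the idea of static, feature-based partitioning concrete, the following is a minimal sketch and not the model described in this abstract. It assumes a hypothetical two-element feature vector per kernel, a hand-made training table, and a 1-nearest-neighbour predictor standing in for the predictive model. The names `TRAINING`, `predict_gpu_share`, and `partition` are illustrative only.

```python
from math import dist

# Hypothetical training data: (static feature vector, best GPU share of the work).
# The features and values are invented for illustration, not taken from the paper.
TRAINING = [
    ((0.9, 0.1), 0.0),  # assumed to run best on the CPU only
    ((0.5, 0.5), 0.5),  # assumed to run best with an even split
    ((0.1, 0.9), 1.0),  # assumed to run best on the GPU only
]


def predict_gpu_share(features):
    """Return the GPU share of the nearest training example (1-nearest-neighbour)."""
    _, share = min(TRAINING, key=lambda example: dist(example[0], features))
    return share


def partition(global_size, gpu_share):
    """Split the 1-D index range [0, global_size) into a CPU chunk and a GPU chunk."""
    gpu_items = round(global_size * gpu_share)
    cpu_items = global_size - gpu_items
    # Each chunk is a half-open (start, end) range of work-item indices.
    return (0, cpu_items), (cpu_items, global_size)


if __name__ == "__main__":
    share = predict_gpu_share((0.2, 0.8))
    cpu_range, gpu_range = partition(1000, share)
    print(share, cpu_range, gpu_range)
```

Because the prediction uses only features available before execution, the split is fixed at launch time and needs no run-time profiling, which is the property that distinguishes a static scheme from a dynamic one.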