Comparison of hardware and software cache coherence schemes
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Shared memory computing on clusters with symmetric multiprocessors and system area networks
ACM Transactions on Computer Systems (TOCS)
GPGPU: general purpose computation on graphics hardware
ACM SIGGRAPH 2004 Course Notes
X10: concurrent programming for modern architectures
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
EXOCHI: architecture and programming environment for a heterogeneous multi-core multithreaded system
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
Enabling scalability and performance in a large scale CMP environment
Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007
Optimization principles and application performance evaluation of a multithreaded GPU using CUDA
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming
Larrabee: a many-core x86 architecture for visual computing
ACM SIGGRAPH 2008 papers
CUBA: an architecture for efficient CPU/co-processor data communication
Proceedings of the 22nd annual international conference on Supercomputing
Foundations of the C++ concurrency memory model
Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation
Compiling Python to a hybrid execution environment
Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units
Proceedings of the 24th ACM International Conference on Supercomputing
Data-aware scheduling of legacy kernels on heterogeneous platforms with distributed memory
Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
Cohesion: a hybrid memory model for accelerators
Proceedings of the 37th annual international symposium on Computer architecture
An OpenCL framework for heterogeneous multicores with local memory
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
A balanced programming model for emerging heterogeneous multicore systems
HotPar'10 Proceedings of the 2nd USENIX conference on Hot topics in parallelism
Bothnia: a dual-personality extension to the Intel integrated graphics driver
ACM SIGOPS Operating Systems Review
Optimizing a shared virtual memory system for a heterogeneous CPU-accelerator platform
ACM SIGOPS Operating Systems Review
Reflex: using low-power processors in smartphones without knowing them
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Design space exploration of memory model for heterogeneous computing
Proceedings of the 2012 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness
Compiler and runtime support for enabling reduction computations on heterogeneous systems
Concurrency and Computation: Practice & Experience
A virtual memory based runtime to support multi-tenancy in clusters with GPUs
Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing
A file I/O system for many-core based clusters
Proceedings of the 2nd International Workshop on Runtime and Operating Systems for Supercomputers
Harmony: collection and analysis of parallel block vectors
Proceedings of the 39th Annual International Symposium on Computer Architecture
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Automatic generation of software pipelines for heterogeneous parallel systems
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Protozoa: adaptive granularity cache coherence
Proceedings of the 40th Annual International Symposium on Computer Architecture
RSVM: a region-based software virtual memory for GPU
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
ACM Transactions on Programming Languages and Systems (TOPLAS)
Journal of Parallel and Distributed Computing
Boosting CUDA Applications with CPU---GPU Hybrid Computing
International Journal of Parallel Programming
Hi-index | 0.00 |
The client computing platform is moving towards a heterogeneous architecture consisting of a combination of cores focused on scalar performance, and a set of throughput-oriented cores. The throughput oriented cores (e.g. a GPU) may be connected over both coherent and non-coherent interconnects, and have different ISAs. This paper describes a programming model for such heterogeneous platforms. We discuss the language constructs, runtime implementation, and the memory model for such a programming environment. We implemented this programming environment in a x86 heterogeneous platform simulator. We ported a number of workloads to our programming environment, and present the performance of our programming environment on these workloads.