Proceedings of the 32nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages
GPGPU: general purpose computation on graphics hardware
ACM SIGGRAPH 2004 Course Notes
Sequoia: programming the memory hierarchy
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
Larrabee: a many-core x86 architecture for visual computing
ACM SIGGRAPH 2008 papers
Programming model for a heterogeneous x86 platform
Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
Parallel search on video cards
HotPar'09 Proceedings of the First USENIX conference on Hot topics in parallelism
PACUE: processor allocator considering user experience
Euro-Par'11 Proceedings of the 2011 international conference on Parallel Processing - Volume 2
Computer systems are moving towards a heterogeneous architecture with a combination of one or more CPUs and one or more accelerator processors. Such heterogeneous systems pose a new challenge to the parallel programming community. Languages such as OpenCL and CUDA provide a programming environment for such systems. However, they focus on data-parallel programming, where the majority of computation is carried out by the accelerators. Our view is that, in the future, accelerator processors will be tightly coupled with the CPUs, be available in different system architectures (e.g., integrated and discrete), and systems will be dynamically reconfigurable. In this paper we advocate a balanced programming model in which computation is balanced between the CPU and its accelerators. This model supports sharing virtual memory between the CPU and the accelerator processors, so the same data structures can be manipulated by both sides. It also supports task-parallel as well as data-parallel programming, fine-grained synchronization, thread scheduling, and load balancing. This model not only leverages the computational capability of CPUs, but also allows dynamic system reconfiguration and supports different platform configurations. To help demonstrate the practicality of our programming model, we present performance results for a preliminary implementation on a computer system with an Intel® Core™ i7 processor and a discrete Larrabee processor. These results show that the model's most performance-critical part, its shared virtual memory implementation, simplifies programming without hurting performance.