Pervasive parallel computing: an historic opportunity for innovation in programming and architecture

  • Authors: Andrew A. Chien
  • Affiliations: Intel Corporation, Portland, OR
  • Venue: Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP)
  • Year: 2007

Abstract

Parallel programming has been the subject of deep research for decades, and it is recognized in the software community as a challenge difficult enough that many companies maintain teams of parallelism and concurrency experts. Further, many ISVs explicitly design their software architectures so that the majority of the development effort, including of course debugging and testing, can be done without consideration of parallelism. What makes parallelism so difficult is the knotty, coupled set of problems of correctness, performance (particularly data locality), and software modularity. In Terascale (manycore) chip-level multiprocessors, we face a pervasive and critical parallel programming challenge. Core counts on a single chip are expected to increase rapidly, progressing with Moore's law, and quad-core systems are already available today in mainstream volume client and server platforms. To continue the rapid performance scaling to which we have become accustomed, applications will need to exhibit ample parallelism, and increasing amounts of it, for successive generations of hardware. Further, because the move to multiple-core parallelism as the primary basis for performance improvement is pervasive, this requirement falls on a wide range of applications, including traditional large-scale commercial and HPC server workloads, desktop and laptop software, and even applications running on small mobile devices. That breadth has numerous implications for the types of solutions required. We will discuss some of the requirements for Terascale parallel programming solutions and point out several potentially fruitful directions. A number of these solutions will build on mainstream programming approaches (objects, modularity, imperative style), in particular introducing parallelism with modest disruption to both large-scale and local-scale program structure. However, there is also an opportunity for radically different approaches (e.g., functional) to take hold in the mainstream.
On the hardware front, there are several reasons why the parallel programming problem for Terascale (manycore) systems is, and can be made, much easier than on previous generations of multiprocessors. The basic hardware characteristics of chip multiprocessors, including a tightly coupled memory system, provide much greater opportunity for efficient coupling and coordination, simplifying a wealth of sophisticated scheduling and sharing structures. Further, the diminishing performance returns from larger single cores free up innovation to support both parallel programming and higher-level programming in general. This is a huge opportunity to pioneer new approaches and solutions that are radically better than those widely used today. We will close with some speculation on the rate at which parallel programming will progress into the mainstream software community and some implications of such proliferation.
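The abstract's point about introducing parallelism with modest disruption to program structure can be illustrated with a small sketch (a hypothetical example, not taken from the talk): a data-parallel map in which the sequential and parallel versions share the same structure and differ only in the execution mechanism.

```python
# Hypothetical sketch: introducing data parallelism with minimal
# disruption to a sequential program's map-style structure.
from concurrent.futures import ThreadPoolExecutor

def work(x):
    # A pure, side-effect-free function parallelizes safely.
    return x * x

def sequential(values):
    # Baseline: an ordinary sequential map.
    return list(map(work, values))

def parallel(values, workers=4):
    # Same map structure; only the executor changes. For CPU-bound
    # work one would typically use a process pool instead of threads.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(work, values))

if __name__ == "__main__":
    data = list(range(8))
    # Both versions compute the same result.
    assert sequential(data) == parallel(data)
```

The design point is the one the abstract makes: when the large-scale program structure (here, a map over independent elements) already exposes the parallelism, the local change needed to exploit it is small.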