ALTER: exploiting breakable dependences for parallelization

Authors:
Abhishek Udupa;Kaushik Rajan;William Thies
Affiliations:
University of Pennsylvania, Philadelphia, PA, USA;Microsoft Research India, Bangalore, India;Microsoft Research India, Bangalore, India
Venue:
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Year:
2011

Abstract

For decades, compilers have relied on dependence analysis to determine the legality of their transformations. While this conservative approach has enabled many robust optimizations, when it comes to parallelization there are many opportunities that can only be exploited by changing or re-ordering the dependences in the program. This paper presents Alter: a system for identifying and enforcing parallelism that violates certain dependences while preserving overall program functionality. Based on programmer annotations, Alter exploits new parallelism in loops by reordering iterations or allowing stale reads. Alter can also infer which annotations are likely to benefit the program by using a test-driven framework. Our evaluation of Alter demonstrates that it uncovers parallelism that is beyond the reach of existing static and dynamic tools. Across a selection of 12 performance-intensive loops, 9 of which have loop-carried dependences, Alter obtains an average speedup of 2.0x on 4 cores.
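
To make "allowing stale reads" concrete, here is a minimal sketch. It is not Alter's annotation language or runtime, which the abstract does not describe. The function names (`sweep_sequential`, `sweep_stale`, `relax`), the relaxation loop, and the chunking scheme are all assumptions chosen for illustration, and the chunks are simulated one after another rather than run on separate cores. The sequential sweep has a loop-carried dependence because iteration i reads the value that iteration i-1 just wrote. The stale variant breaks that dependence across chunk boundaries: each chunk works on a snapshot taken before the sweep and commits only its own writes.

```python
def sweep_sequential(x):
    """One in-place sweep: iteration i reads x[i-1] written by iteration i-1."""
    for i in range(1, len(x) - 1):
        x[i] = 0.5 * (x[i - 1] + x[i + 1])


def sweep_stale(x, num_chunks):
    """One sweep in which every chunk reads a snapshot taken before the sweep.

    Chunks are simulated sequentially here; because no chunk sees another
    chunk's writes, they could run concurrently without changing the result.
    """
    snapshot = list(x)                        # state visible to every chunk
    interior = len(x) - 2                     # boundary values stay fixed
    size = -(-interior // num_chunks)         # ceiling division
    for start in range(1, len(x) - 1, size):
        stop = min(start + size, len(x) - 1)
        private = list(snapshot)              # chunk-private copy (stale reads)
        for i in range(start, stop):
            private[i] = 0.5 * (private[i - 1] + private[i + 1])
        x[start:stop] = private[start:stop]   # commit only this chunk's writes


def relax(sweep, n=16, tol=1e-9, max_sweeps=100_000):
    """Repeat `sweep` until values stop changing; return (values, sweep count)."""
    x = [0.0] * n
    x[-1] = 1.0                               # boundary conditions: 0 and 1
    for count in range(1, max_sweeps + 1):
        before = list(x)
        sweep(x)
        if max(abs(a - b) for a, b in zip(before, x)) < tol:
            return x, count
    raise RuntimeError("did not converge")


exact, exact_sweeps = relax(sweep_sequential)
stale, stale_sweeps = relax(lambda x: sweep_stale(x, num_chunks=4))
print(exact_sweeps, stale_sweeps)
print(max(abs(a - b) for a, b in zip(exact, stale)))
```

In this toy loop both variants settle on the same answer, a straight line between the two boundary values, although the intermediate values differ from sweep to sweep. That is the sense in which a dependence can be breakable: violating it changes the execution but not the overall program functionality. Whether a given loop tolerates this is exactly what has to be checked, for example by testing outputs as the abstract's test-driven framework suggests.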