General data structure expansion for multi-threading

Authors:
Hongtao Yu;Hou-Jen Ko;Zhiyuan Li
Affiliations:
Purdue University, West Lafayette, IN, USA;Purdue University, West Lafayette, IN, USA;Purdue University, West Lafayette, IN, USA
Venue:
Proceedings of the 34th ACM SIGPLAN conference on Programming language design and implementation
Year:
2013

Citing 34
Cited 0

Advanced compiler optimizations for supercomputers

Communications of the ACM - Special issue on parallelism
Array expansion

ICS '88 Proceedings of the 2nd international conference on Supercomputing
Array privatization for parallel execution of loops

ICS '92 Proceedings of the 6th international conference on Supercomputing
Array-data flow analysis and its use in array privatization

POPL '93 Proceedings of the 20th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
The privatizing DOALL test: a run-time technique for DOALL loop identification and array privatization

ICS '94 Proceedings of the 8th international conference on Supercomputing
The LRPD test: speculative run-time parallelization of loops with privatization and reduction parallelization

PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
A scalable method for run-time loop parallelization

International Journal of Parallel Programming
Detecting coarse-grain parallelism using an interprocedural parallelizing compiler

Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
Experience with efficient array data flow analysis for array privatization

PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
Hybrid analysis: static & dynamic memory reference analysis

ICS '02 Proceedings of the 16th international conference on Supercomputing
OpenMP: An Industry-Standard API for Shared-Memory Programming

IEEE Computational Science & Engineering
On Privatization of Variables for Data-Parallel Execution

IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
The R-LRPD Test: Speculative Parallelization of Partially Parallel Loops

IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Experience in the Automatic Parallelization of Four Perfect-Benchmark Programs

Proceedings of the Fourth International Workshop on Languages and Compilers for Parallel Computing
Automatic Array Privatization

Proceedings of the 6th International Workshop on Languages and Compilers for Parallel Computing
MiBench: A free, commercially representative embedded benchmark suite

WWC '01 Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop
Software behavior oriented parallelization

Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
A Practical Approach to Exploiting Coarse-Grained Pipeline Parallelism in C Programs

Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Copy or Discard execution model for speculative parallelization on multicores

Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Parallelizing sequential applications on commodity hardware using a low-cost software transactional memory

Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
Towards a holistic approach to auto-parallelization: integrating profile-driven parallelism detection and machine-learning based mapping

Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
MediaBench II video: Expediting the next generation of video systems research

Microprocessors & Microsystems
Alchemist: A Transparent Dependence Distance Profiling Infrastructure

Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization
Fast Track: A Software System for Speculative Program Optimization

Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization
Supporting speculative parallelization in the presence of dynamic data structures

PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
Speculative parallelization using state separation and multiple value prediction

Proceedings of the 2010 international symposium on Memory management
The Paralax infrastructure: automatic parallelization with a helping hand

Proceedings of the 19th international conference on Parallel architectures and compilation techniques
SD3: A Scalable Approach to Dynamic Data-Dependence Profiling

MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
SpiceC: scalable parallelism via implicit copying and explicit commit

Proceedings of the 16th ACM symposium on Principles and practice of parallel programming
Commutative set: a language extension for implicit parallel programming

Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Speculative separation for privatization and reductions

Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
Effective parallelization of loops in the presence of I/O operations

Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
Fast loop-level data dependence profiling

Proceedings of the 26th ACM international conference on Supercomputing
Multi-slicing: a compiler-supported parallel approach to data dependence profiling

Proceedings of the 2012 International Symposium on Software Testing and Analysis

Quantified Score

Hi-index	0.00

Visualization

Abstract

Among techniques for parallelizing sequential codes, privatization is a common and significant transformation performed by both compilers and runtime parallelizing systems. Without privatization, repetitive updates to the same data structures often introduce spurious data dependencies that hide the inherent parallelism. Unfortunately, it remains a significant challenge to compilers to automatically privatize dynamic and recursive data structures which appear frequently in real applications written in languages such as C/C++. This is because such languages lack a naming mechanism to define the address range of a pointer-based data structure, in contrast to arrays with explicitly declared bounds. In this paper we present a novel solution to this difficult problem by expanding general data structures such that memory accesses issued from different threads to contentious data structures are directed to different data fields. Based on compile-time type checking and a data dependence graph, this aggressive extension to the traditional scalar and array expansion isolates the address ranges among different threads, without struggling with privatization based on thread-private stacks, such that the targeted loop can be effectively parallelized. With this method fully implemented in GCC, experiments are conducted on a set of programs from well-known benchmark suites such as Mibench, MediaBench II and SPECint. Results show that the new approach can lead to a high speedup when executing the transformed code on multiple cores.