Interprocedural dependence analysis and parallelization
SIGPLAN '86 Proceedings of the 1986 SIGPLAN symposium on Compiler construction
The program dependence graph and its use in optimization
ACM Transactions on Programming Languages and Systems (TOPLAS)
Automatic translation of FORTRAN programs to vector form
ACM Transactions on Programming Languages and Systems (TOPLAS)
Semantical interprocedural parallelization: an overview of the PIPS project
ICS '91 Proceedings of the 5th international conference on Supercomputing
Fortran 90 handbook: complete ANSI/ISO reference
Fortran 90 handbook: complete ANSI/ISO reference
Run-time methods for parallelizing partially parallel loops
ICS '95 Proceedings of the 9th international conference on Supercomputing
Parallel Computing - Special double issue on environment and tools for parallel scientific computing
Maximizing parallelism and minimizing synchronization with affine transforms
Proceedings of the 24th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Programming with POSIX threads
Programming with POSIX threads
Array SSA form and its use in parallelization
POPL '98 Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
The implementation of the Cilk-5 multithreaded language
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Data speculation support for a chip multiprocessor
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Optimizing compilers for modern architectures: a dependence-based approach
Optimizing compilers for modern architectures: a dependence-based approach
A stream compiler for communication-exposed architectures
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Interactive Parallel Programming using the ParaScope Editor
IEEE Transactions on Parallel and Distributed Systems
Dynamic Dependence Analysis: A Novel Method for Data Depndence Evaluation
Proceedings of the 5th International Workshop on Languages and Compilers for Parallel Computing
Proceedings of the 6th International Workshop on Languages and Compilers for Parallel Computing
Unified Analysis of Array and Object References in Strongly Typed Languages
SAS '00 Proceedings of the 7th International Symposium on Static Analysis
Standard Templates Adaptive Parallel Library (STAPL)
LCR '98 Selected Papers from the 4th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
Effective Representation of Aliases and Indirect Memory Operations in SSA Form
CC '96 Proceedings of the 6th International Conference on Compiler Construction
Extended SSA numbering: introducing SSA properties to languages with multi-level pointers
CASCON '96 Proceedings of the 1996 conference of the Centre for Advanced Studies on Collaborative research
A performance analysis of the Berkeley UPC compiler
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
The Jrpm system for dynamically parallelizing Java programs
Proceedings of the 30th annual international symposium on Computer architecture
Automatic Thread Extraction with Decoupled Software Pipelining
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Bulk Disambiguation of Speculative Threads in Multiprocessors
Proceedings of the 33rd annual international symposium on Computer Architecture
A two-phase escape analysis for parallel java programs
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
IEICE - Transactions on Information and Systems
X10: concurrent programming for modern architectures
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Optimistic parallelism requires abstractions
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
Software behavior oriented parallelization
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
Making context-sensitive points-to analysis with heap cloning practical for the real world
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
An annotation language for optimizing software libraries
DSL'99 Proceedings of the 2nd conference on Conference on Domain-Specific Languages - Volume 2
Sensitivity analysis for automatic parallelization on multi-cores
Proceedings of the 21st annual international conference on Supercomputing
Implicitly parallel programming models for thousand-core microprocessors
Proceedings of the 44th annual Design Automation Conference
Revisiting the Sequential Programming Model for Multi-Core
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
A Practical Approach to Exploiting Coarse-Grained Pipeline Parallelism in C Programs
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Extracting coarse-grain parallelism in general-purpose programs
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming
Parallel-stage decoupled software pipelining
Proceedings of the 6th annual IEEE/ACM international symposium on Code generation and optimization
Embla - Data Dependence Profiling for Parallel Programming
CISIS '08 Proceedings of the 2008 International Conference on Complex, Intelligent and Software Intensive Systems
Copy or Discard execution model for speculative parallelization on multicores
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
Commutative set: a language extension for implicit parallel programming
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
ALTER: exploiting breakable dependences for parallelization
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Programmer-assisted automatic parallelization
Proceedings of the 2011 Conference of the Center for Advanced Studies on Collaborative Research
POET: a scripting language for applying parameterized source-to-source program transformations
Software—Practice & Experience
A model-driven approach for software parallelization
MODELS'11 Proceedings of the 2011th international conference on Models in Software Engineering
Parcae: a system for flexible parallel execution
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
Speculative separation for privatization and reductions
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
Fast loop-level data dependence profiling
Proceedings of the 26th ACM international conference on Supercomputing
Multi-slicing: a compiler-supported parallel approach to data dependence profiling
Proceedings of the 2012 International Symposium on Software Testing and Analysis
A data dependence test based on the projection of paths over shape graphs
Journal of Parallel and Distributed Computing
Portable section-level tuning of compiler parallelized applications
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
General data structure expansion for multi-threading
Proceedings of the 34th ACM SIGPLAN conference on Programming language design and implementation
Fast condensation of the program dependence graph
Proceedings of the 34th ACM SIGPLAN conference on Programming language design and implementation
Integrating profile-driven parallelism detection and machine-learning-based mapping
ACM Transactions on Architecture and Code Optimization (TACO)
Hi-index | 0.00 |
Speeding up sequential programs on multicores is a challenging problem that is in urgent need of a solution. Automatic parallelization of irregular pointer-intensive codes, exemplified by the SPECint codes, is a very hard problem. This paper shows that, with a helping hand, such auto-parallelization is possible and fruitful. This paper makes the following contributions: (i) A compiler-framework for extracting pipeline-like parallelism from outer program loops is presented. (ii) Using a light-weight programming model based on annotations, the programmer helps the compiler to find thread-level parallelism. Each of the annotations specifies only a small piece of semantic information that compiler analysis misses, e.g. stating that a variable is dead at a certain program point. The annotations are designed such that correctness is easily verified. Furthermore, we present a tool for suggesting annotations to the programmer. (iii) The methodology is applied to auto-parallelize several SPECint benchmarks. For the benchmark with most parallelism (hmmer), we obtain a scalable 7-fold speedup on an AMD quad-core dual processor. The annotations constitute a parallel programming model that relies extensively on a sequential program representation. Hereby, the complexity of debugging is not increased and it does not obscure the source code. These properties could prove valuable to increase the efficiency of parallel programming.