Handling task dependencies under strided and aliased references

Authors:
Josep M. Perez;Rosa M. Badia;Jesus Labarta
Affiliations:
Universitat Politècnica de Catalunya (UPC-DAC);Spanish National Research Council (CSIC - IIIA);Universitat Politècnica de Catalunya (UPC-DAC)
Venue:
Proceedings of the 24th ACM International Conference on Supercomputing
Year:
2010

Citing 10
Cited 6

Direct parallelization of call statements

SIGPLAN '86 Proceedings of the 1986 SIGPLAN symposium on Compiler construction
LAPACK Users' guide (third ed.)

LAPACK Users' guide (third ed.)
Efficient and precise array access analysis

ACM Transactions on Programming Languages and Systems (TOPLAS)
Hybrid analysis: static & dynamic memory reference analysis

ICS '02 Proceedings of the 16th international conference on Supercomputing
An Implementation of Interprocedural Bounded Regular Section Analysis

IEEE Transactions on Parallel and Distributed Systems
Interprocedural dependence analysis and parallelization

ACM SIGPLAN Notices - Best of PLDI 1979-1999
X10: an object-oriented approach to non-uniform cluster computing

OOPSLA '05 Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Parallel Programmability and the Chapel Language

International Journal of High Performance Computing Applications
Improving the accuracy of snoop filtering using stream registers

MEDEA '07 Proceedings of the 2007 workshop on MEmory performance: DEaling with Applications, systems and architecture
CellSs: making it easier to program the cell broadband engine processor

IBM Journal of Research and Development

Legion: expressing locality and independence with logical regions

SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
OpenStream: Expressiveness and data-flow compilation of OpenMP streaming programs

ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
Transactional access to shared memory in starss, a task based programming model

Euro-Par'12 Proceedings of the 18th international conference on Parallel Processing
Implementing OmpSs support for regions of data in architectures with multiple address spaces

Proceedings of the 27th international ACM conference on International conference on supercomputing
Prefetching and cache management using task lifetimes

Proceedings of the 27th international ACM conference on International conference on supercomputing
Analysis of dependence tracking algorithms for task dataflow execution

ACM Transactions on Architecture and Code Optimization (TACO)

Quantified Score

Hi-index	0.00

Visualization

Abstract

The emergence of multicore processors has increased the need for simple parallel programming models usable by nonexperts. The ability to specify subparts of a bigger data structure is an important trait of High Productivity Programming Languages. Such a concept can also be applied to dependency-aware task-parallel programming models. In those paradigms, tasks may have data dependencies, and those are used for scheduling them in parallel. However, calculating dependencies between subparts of bigger data structures is challenging. Accessed data may be strided, and can fully or partially overlap the accesses of other tasks. Techniques that are too approximate may produce too many extra dependencies and limit parallelism. Techniques that are too precise may be impractical in terms of time and space. We present the abstractions, data structures and algorithms to calculate dependencies between tasks with strided and possibly different memory access patterns. Our technique is performed at run time from a description of the inputs and outputs of each task and is not affected by pointer arithmetic nor reshaping. We demonstrate how it can be applied to increase programming productivity. We also demonstrate that scalability is comparable to other solutions and in some cases higher due to better parallelism extraction.