Data dependence and its application to parallel processing
International Journal of Parallel Programming
One of the major issues in parallelizing applications is dealing with the inherent dependency structure of the program. Dependence analysis provides execution-order constraints between program statements and can establish which program transformations are legitimate. Data dependency is one class of dependence obtained through such analysis, and it is closely related to data parallelism. Since automatic dependence analysis has proved too complex in the general case, parallelizing compilers cannot handle every dependency pattern. In many cases, however, the data dependency pattern of a computation is independent of the actual data values, i.e., it is static, even though the pattern may scale with the size of the data set. In this paper we explore how such a static, scalable data dependency can be presented to the compiler in a meaningful way. We describe the major components of a proposed framework in which static, and possibly scalable, data dependencies are turned into programmable entities. The framework provides a high-level, easy-to-manipulate way to deal with data distribution and the placement of computations onto any parallel system that has a well-defined space-time communication structure. A compiler can then combine the data dependency information with the placement information to generate parallel code. We explore the idea of programmable data placements in more detail through concrete examples targeting the CUDA API of Nvidia GPUs.
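The core idea — a static, scalable data dependency expressed as a programmable entity, from which a space-time placement and executable code can be derived — can be sketched in plain Python. The function names and interfaces below are illustrative assumptions, not the paper's actual framework; the pattern shown is a binary-tree reduction, whose dependence structure is fixed by the problem size n and independent of the data values.

```python
def tree_reduction_pattern(n):
    """A programmable dependence pattern (illustrative sketch): for each
    time step of a binary-tree reduction over n elements, yield the set
    of (dst, src) dependence edges. The pattern depends only on n, never
    on the data values, i.e., it is static but scales with problem size."""
    stride = 1
    while stride < n:
        yield [(dst, dst + stride)
               for dst in range(0, n - stride, 2 * stride)]
        stride *= 2

def placement(pattern, n):
    """A trivial space-time placement: the operation writing dst runs on
    'processor' dst at the time step where its edge appears. Returns a
    list of (op, proc, time) triples per step."""
    return [[(dst, dst, t) for (dst, _src) in step]
            for t, step in enumerate(pattern(n))]

def execute(pattern, data):
    """Interpret the pattern sequentially, standing in for generated
    parallel code: edges within one step carry no mutual dependences,
    so each inner loop could run fully in parallel."""
    data = list(data)
    for step in pattern(len(data)):
        for dst, src in step:
            data[dst] += data[src]
    return data[0]

print(execute(tree_reduction_pattern, range(8)))        # -> 28
print(len(list(tree_reduction_pattern(8))))             # -> 3 time steps
```

On a GPU target, a code generator could map each step's independent edges onto CUDA threads, with the step boundaries becoming synchronization points — which is the sense in which the dependency information plus the placement information suffices to generate parallel code.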