Computing has moved relentlessly through giga-, tera-, and peta-scale systems, and exascale computing (a million trillion operations per second) is now under active research. DARPA has recently sponsored the Ubiquitous High-Performance Computing (UHPC) program [1], encouraging partnership with academia and industry to explore such systems. Among the requirements is the development of novel "self-awareness" techniques in support of performance, energy efficiency, and resiliency. Trends in processor and system architecture, driven by power and complexity, point toward very high-core-count designs and extreme software parallelism for solving exascale-class problems. Our research explores a fine-grain, event-driven model that supports adaptive operation of these machines. We are developing a Codelet Program Execution Model, which breaks applications into codelets (small units of functionality) and the control and data dependencies between them. The model then uses this decomposition to perform advanced scheduling, to accommodate code and data motion within the system, and to permit flexible exploitation of parallelism in support of performance and power goals.
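To illustrate the decomposition described above, the following is a minimal, single-threaded sketch of dependency-driven firing: a codelet becomes ready only once every codelet it depends on has completed. It is not the authors' runtime. The names `Codelet`, `run`, and the four example codelets are hypothetical, chosen only for illustration, and the sketch omits the scheduling policies, data motion, and power awareness that the model targets.

```python
from collections import deque


class Codelet:
    """Hypothetical unit of work: a function plus the codelets it depends on."""

    def __init__(self, name, fn, deps=()):
        self.name = name
        self.fn = fn
        self.deps = list(deps)  # names of codelets whose results are the inputs


def run(codelets):
    """Fire each codelet once all of its dependencies have completed."""
    by_name = {c.name: c for c in codelets}
    pending = {c.name: len(c.deps) for c in codelets}  # unmet dependency counts
    dependents = {c.name: [] for c in codelets}
    for c in codelets:
        for d in c.deps:
            dependents[d].append(c.name)

    # Codelets with no dependencies are ready immediately.
    ready = deque(name for name, count in pending.items() if count == 0)
    results, order = {}, []

    while ready:
        name = ready.popleft()
        codelet = by_name[name]
        # Inputs are the results of the dependencies, in declaration order.
        results[name] = codelet.fn(*(results[d] for d in codelet.deps))
        order.append(name)
        # Completion is the "event": signal every dependent codelet.
        for succ in dependents[name]:
            pending[succ] -= 1
            if pending[succ] == 0:
                ready.append(succ)

    if len(order) != len(codelets):
        raise ValueError("dependency cycle: some codelets never became ready")
    return results, order


# Example graph (assumed for illustration): two loads feed a multiply,
# whose result feeds an increment.
graph = [
    Codelet("load_a", lambda: 2),
    Codelet("load_b", lambda: 3),
    Codelet("mul", lambda a, b: a * b, deps=["load_a", "load_b"]),
    Codelet("inc", lambda m: m + 1, deps=["mul"]),
]
results, order = run(graph)
print(results["inc"], order)
```

In this sketch the ready queue stands in for the scheduler. A real runtime could drain it from many cores at once, since any two codelets that are ready at the same time (here `load_a` and `load_b`) have no dependency between them.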