High-performance computing has been on an inexorable march from gigascale to tera- and petascale, with many researchers now actively contemplating exascale (10^18, or a million trillion operations per second) systems. This progression is being accelerated by the rapid increase in multi- and many-core processors, which allow even greater opportunities for parallelism. Such densities, though, give rise to a new cohort of challenges; for example, containing system software overhead, dealing with large numbers of schedulable entities, and maintaining energy efficiency. We are studying software and processor-architectural features that will allow us to achieve these goals. We believe that exascale operation will require significant "out of the box" thinking, specifically in terms of the role of operating systems and system software. We describe some of our research into how these goals can be achieved.