One approach to building the next generation of parallel computers is based on large aggregates of multiprocessor chips with support for hardware multithreading. An initial design for IBM's Blue Gene/C project exemplifies this approach. Such a machine might consist of a million processors and would be characterized by a low memory-to-processor ratio. To study alternative programming models for such a machine before it is built, we have developed an emulator that allows million-processor programs to be run on conventional parallel machines with hundreds of processors. Here we present the implementation of a parallel object model, based on Charm++, as a candidate programming model. Although the "ideal" programming model for such machines is a matter of continuing research, we believe that parallel objects represent a good starting point. This paper reviews the target architecture, presents the programming model, and describes the emulator implementation. Case studies of simple applications written using the emulator are also discussed.
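The central idea of running a million-processor program on a few hundred real processors can be illustrated with a small sketch. The code below is not the emulator described in the paper and does not use the Charm++ API. It is a toy illustration in Python, and the class name `Emulator`, the block mapping in `host_of`, and the per-host message queues are all assumptions made for the example.

```python
from collections import deque


class Emulator:
    """Toy model: many emulated nodes hosted on a few physical processors.

    Hypothetical design for illustration only; the real emulator's
    mapping and scheduling policies are not specified in the abstract.
    """

    def __init__(self, num_emulated, num_physical):
        self.num_emulated = num_emulated
        self.num_physical = num_physical
        # One message queue per physical processor (assumed design).
        self.queues = [deque() for _ in range(num_physical)]
        self.delivered = []

    def host_of(self, node):
        """Block mapping: contiguous ranges of emulated nodes share a host."""
        if not 0 <= node < self.num_emulated:
            raise ValueError("emulated node index out of range")
        return node * self.num_physical // self.num_emulated

    def send(self, dest_node, payload):
        """Queue a message on the physical processor hosting dest_node."""
        self.queues[self.host_of(dest_node)].append((dest_node, payload))

    def run(self):
        """Drain each host's queue in turn, recording every delivery."""
        for host, queue in enumerate(self.queues):
            while queue:
                node, payload = queue.popleft()
                self.delivered.append((host, node, payload))
        return self.delivered


emu = Emulator(num_emulated=1_000_000, num_physical=100)
emu.send(10_000, "hello")
emu.send(0, "hi")
deliveries = emu.run()
```

With these numbers each physical processor hosts 10,000 emulated nodes, so the application code addresses a million targets while the underlying machine stays small. A real emulator would also have to model the target's threads and limited per-node memory, which this sketch leaves out.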