Some of the most challenging applications to parallelize scalably are those that perform a relatively small amount of computation per iteration. In such cases, multiple interacting performance challenges must be identified and solved to attain high parallel efficiency. We present a case study involving NAMD, a parallel molecular dynamics application, and the efforts to scale it to 3000 processors with teraflops-level performance. NAMD is implemented in Charm++, and the performance analysis was carried out using Projections, the performance visualization and analysis tool associated with Charm++. We showcase a series of optimizations facilitated by Projections. The resulting performance of NAMD led to a Gordon Bell award at SC2002.