Supercomputer designers traditionally focus on low-level hardware performance criteria such as CPU cycle speed, disk bandwidth, and memory latency. The High-Performance Computing (HPC) community has more recently begun to realize that escalating hardware performance is, by itself, contributing less and less to real productivity: the ability to develop and deploy high-performance supercomputer applications at acceptable time and cost. The Defense Advanced Research Projects Agency (DARPA) High Productivity Computing Systems (HPCS) initiative challenged industry vendors to design a new generation of supercomputers that would deliver a 10x improvement in this newly acknowledged but poorly understood domain of real productivity. Sun Microsystems, choosing to abandon customary evolutionary approaches, responded with two revolutionary decisions. The first was to investigate the nature of supercomputer productivity in the full context of use, which includes people, organizations, goals, practices, and skills as well as processors, disks, memory, and software. The second decision was to rethink completely the design of supercomputing systems, informed by productivity-based requirements and driven by recent technological breakthroughs. Crucial to the implementation of these decisions was the establishment of multidisciplinary, closely collaborating teams that conducted research into productivity and developed the many closely intertwined design decisions needed to meet DARPA's challenge.
Among the most significant results from Sun's productivity research was a detailed diagnosis of software development as the dominant barrier to productivity improvements in the HPC community. The level of expertise required, combined with the amount of effort needed to develop conventional HPC codes, has already created a crisis of productivity. Even worse, there is no path forward within the existing paradigm that will significantly increase productivity as hardware systems scale up. The same issues also prevent HPC from "scaling out" to a broader class of applications.
This diagnosis led to design requirements that address specific issues behind the expertise and effort bottlenecks. Sun's design teams explored complex, system-wide tradeoffs needed to meet these requirements in all aspects of the design, including reliability, performance, programmability, and ease of administration. These tradeoffs drew on technological advances in massive chip multithreading, extremely high-performance interconnects, resource virtualization, and programming language design. The outcome was the design for a machine to operate at petascale, with extremely high reliability and a greatly simplified programming model. Although this design supports existing codes and software technologies (crucial requirements), it also anticipates that the greatest productivity breakthroughs will follow from dramatic changes in how HPC codes are developed, changes that require a system of the type designed by Sun's HPCS team.