Harmonizing high performance computing (HPC) with large-scale complex systems in computational science and engineering

  • Authors:
  • Clyde Chittister; Yacov Y. Haimes

  • Affiliations:
  • Software Engineering Institute, Carnegie Mellon University, Pittsburgh, PA 15213; Center for Risk Management of Engineering Systems, University of Virginia, 112A Olsson Hall, Charlottesville, VA 22903

  • Venue:
  • Systems Engineering
  • Year:
  • 2010

Abstract

This paper addresses the challenges facing the three different groups within the professional community that support the development of large-scale computational science and engineering (CSE) software applications. Scientific and engineering systems are but one example of large-scale complex (LSC) systems. The authors recognize that many of the observations made in this paper also apply to the larger domain. The focus of this paper is restricted to CSE because it is the source of the experiences from which the observations are derived. The first of the groups encompasses the application developers of large-scale scientific and engineering software systems, especially those requiring high-performance computing (HPC). The second group covers the HPC software development and run-time environments. The third consists of the integrators of the first two groups, with a focus on the systems engineers whose task is to bridge the technical and cultural gap between the other two. These challenges reside in several areas, the most important being the educational and cultural backgrounds that are reflected in the knowledge, expertise, and experience of the principals involved. In particular, the cultural gaps, and thus the communication and ultimate systems integration challenges, are byproducts of an academic educational system that graduates most software engineers without formal courses in systems engineering, process control, and systems integration. All three groups of challenges are explored individually and in conjunction with each other. Thus, a major problem addressed here is centered in software, because the developers of LSC systems in CSE perceive hardware through the HPC software development environment.
Given the unique attributes and characteristics of LSC-HPC systems, this paper discusses two complementary approaches that meet, at least partially, the challenge of modeling, assessing, and managing the risk associated with them: (1) decomposition and hierarchical modeling, which, when applied in conjunction with conventional aggregate modeling methods, offer several promising advantages, especially when dealing with LSC-HPC systems; and (2) Phantom System Models, used as a real-time and virtual modeling laboratory for systems integration and risk assessment and management. © 2009 Wiley Periodicals, Inc. Syst Eng