System software for high end computing

  • Authors:
  • Patrick G. Bridges;Arthur B. MacCabe;Orran Krieger

  • Affiliations:
  • University of New Mexico, Albuquerque, NM;University of New Mexico, Albuquerque, NM;IBM T.J. Watson Research Center, Yorktown Heights, NY

  • Venue:
  • ACM SIGOPS Operating Systems Review
  • Year:
  • 2006

Abstract

The challenges facing the high-end computing (HEC) system software community fundamentally originate from the need to exploit massive parallelism efficiently. This parallelism comes either at fine granularity, through multiple cores or hardware accelerators, or at large granularity, where systems have been built with 10,000 nodes connected over a high-performance network. Such parallelism exposes new hardware abstractions that the OS needs to virtualize; introduces new reliability, management, power-management, and file-system problems; and generally requires more extensive software layers to shield programmers and system administrators from the complexity of massive parallelism. While the HEC community clearly deals with extreme performance issues, many of the same trends are reaching the broader market. For example, all processor vendors are moving towards multiple cores, special-purpose accelerators are becoming increasingly commoditized, and massive scale-out systems are being used today by companies like Google.