PromisQoS: An Architecture for Delivering QoS to High-Performance Applications on Myrinet Clusters

  • Authors:
  • Jothi P. Neelamegam; Srigurunath Chakravarthi; Manoj Apte; Anthony Skjellum

  • Venue:
  • LCN '03 Proceedings of the 28th Annual IEEE International Conference on Local Computer Networks
  • Year:
  • 2003

Abstract

Clusters of workstations are being extensively used for solving computationally intensive scientific problems. However, there is limited support for Quality of Service (QoS) based distributed computing on commercial off-the-shelf (COTS) clusters. This limitation has restricted successful deployment of distributed real-time high-performance computing applications to customized and dedicated embedded multi-processor systems. This paper describes research work that attempts to provide a cluster platform that can guarantee access to computational and communication resources to distributed applications. The authors have developed PromisQoS, an architecture that supports execution of hard real-time distributed applications on a Linux cluster while providing high-throughput and low-latency communication using Myrinet. PromisQoS consists of the following major components: Hare, BDM-RT, and Turtle. Hare is a prototype implementation of time-based QoS channels specified by the Real-Time Message Passing Interface (MPI/RT 1.1) standard. BDM-RT is a low-level messaging library on Myrinet that provides deterministic communication latency and bandwidth on Myrinet. Turtle, a variant of RT-Linux, is the real-time operating system that provides guaranteed computation time. This work demonstrates that it is possible to deploy hard real-time distributed applications on COTS clusters and underlines the significance of the MPI/RT API in the realm of distributed high-performance computing applications that require QoS.