Balancing Contention and Synchronization on the Intel Paragon

  • Authors:
  • Shahid H. Bokhari; David M. Nicol

  • Venue:
  • IEEE Parallel & Distributed Technology: Systems & Technology
  • Year:
  • 1997

Abstract

The Intel Paragon is a mesh-connected, distributed-memory, message-passing parallel computer. It uses an oblivious and deterministic message-routing algorithm, which lets parallel programmers develop highly optimized schedules for frequently needed communication patterns. The complete exchange is one such pattern. Several approaches are available for carrying it out on the mesh. The authors study an algorithm developed by David Scott. This algorithm assumes that a communication link can carry only one message at a time and that a node can transmit only one message at a time. It requires global synchronization to enforce a schedule of transmissions. Unfortunately, global synchronization incurs substantial overhead on the Paragon. However, the machine's powerful interconnection mechanism permits two or three messages to share a communication link with only minor overhead. It can also overlap multiple message transmissions from the same node, to some extent. The authors develop a generalization of Scott's algorithm that executes complete exchange with a prescribed level of contention. Schedules that incur greater contention require fewer synchronization steps, which lets the authors trade off contention against synchronization overhead. The authors describe this algorithm's performance and compare it with the original algorithm and with a naive algorithm that does not take interconnection structure into account. The bounded-contention algorithm always improves on the original algorithm and outperforms the naive algorithm for all but the smallest message sizes. The naive algorithm fails to work on meshes larger than 12 × 12. These results show that due consideration of processor interconnect and machine-performance parameters is necessary to obtain peak performance from the Paragon and its successor mesh machines.
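
To make the notion of link contention concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes dimension-ordered routing (x first, then y) as the deterministic routing rule, and counts how many messages in one communication step share the same directed link. The function names `xy_route`, `max_link_contention`, and `shift_step` are hypothetical, and `shift_step` is only one illustrative topology-unaware step, not the naive algorithm measured in the paper.

```python
from collections import Counter


def xy_route(src, dst):
    """Directed links used by one message, assuming x-then-y routing."""
    (x, y), (dx, dy) = src, dst
    links = []
    while x != dx:                      # travel along the x dimension first
        nx = x + (1 if dx > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dy:                      # then travel along the y dimension
        ny = y + (1 if dy > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links


def max_link_contention(messages):
    """Largest number of messages in one step that share a directed link."""
    load = Counter(link for src, dst in messages for link in xy_route(src, dst))
    return max(load.values(), default=0)


def shift_step(n, k):
    """Hypothetical topology-unaware step on an n x n mesh:
    node i sends to node (i + k) mod n*n, with nodes numbered row by row."""
    coords = [(i % n, i // n) for i in range(n * n)]
    return [(coords[i], coords[(i + k) % (n * n)]) for i in range(n * n)]


if __name__ == "__main__":
    for k in (1, 5, 8):
        print(k, max_link_contention(shift_step(4, k)))
```

A schedule in the spirit of the abstract would group transmissions into steps so that `max_link_contention` stays at or below a chosen bound: a bound of 1 corresponds to the contention-free assumption, while a larger bound allows more messages per step and therefore fewer synchronization points.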