The physical mapping problem for parallel architectures

  • Authors:
  • Lenwood S. Heath;Arnold L. Rosenberg;Bruce T. Smith

  • Affiliations:
  • Virginia Polytechnic Institute, Blacksburg;Univ. of Massachusetts, Amherst;Univ. of North Carolina, Chapel Hill

  • Venue:
  • Journal of the ACM (JACM)
  • Year:
  • 1988

Quantified Score

Hi-index 0.00

Visualization

Abstract

The problem of realizing an idealized parallel architecture on a (possibly fault-laden) physical architecture is studied. Our formulation performs the mapping in the light of the algorithm that one wants to implement on the idealized architecture. A version of the mapping algorithm suggested by the DIOGENES methodology for designing fault-tolerant VLSI processor arrays is settled definitely. Two quality metrics for mappings are considered, the first embodying an idealized notion of average delay, which relates to power consumption, and the second being the length of the longest run of wire. For the average-delay measure, four algorithms that optimally assign the m vertices of the embedded graph to the n fault-free processors that have been fabricated are presented. The most general algorithm makes no assumptions about the structure of the array or the physical format of the processors; it runs in time O(m · (n - m)2). The other algorithms assume that the processors are laid out in such a way that interprocessor distances obey the triangle equality; they run in times ranging from time O(max{m, n - m} · log min {m, n - m}) for certain array structures, including linear arrays, to time O(max{m, n - m}) for a narrow class of array structures, including pyramid arrays. For the max-wire-run cost measure, it is shown that the problem of finding cost-optimal vertex-to-processor assignments is NP-complete. However, an algorithm is presented that yields, in time O(m · (n - m)2), vertex-to-processor assignments that are within a factor of 3 of optimal (they are optimal when the input graph-embedding is outplanar). This algorithm can easily be converted to one that yields, in time O(m · (n - m)3), vertex-to-processor assignments that are within a factor of 2 of optimal. Finally, an algorithm that yields optimal assignments when the interprocessor distances obey the triangle equality is presented; this algorithm operates in time O(m · (n - m) · log(m · (n - m)) · log M), where M is the largest interprocessor distance.