An Approach towards an Analytical Characterization of Locality and its Portability

  • Authors:
  • Affiliations:
  • Venue: IWIA '01 Proceedings of the Innovative Architecture for Future Generation High-Performance Processors and Systems (IWIA'01)
  • Year: 2001

Abstract

The evolution of computing technology towards its ultimate physical limits makes communication the dominant cost of computing. It would therefore be desirable to have a framework for the study of locality, which we define as the property of an algorithm that enables implementations with reduced communication overhead. We view as part of the algorithm all those aspects that define, at a functional level, the operations to be performed and their data dependences. We view as part of the implementation all those aspects that pertain to the use of machine resources during execution of the algorithm, such as operation scheduling, memory management, and message routing.

We discuss useful characterizations of the locality of an algorithm with reference to both single machines and classes of machines. We then consider the portability of locality, viewed as the existence of a single implementation that is optimal across a class of machines. We also formulate a less stringent notion of portability, in which the implementation may be parametrized so as to adapt to the executing platform, and suboptimal performance is accepted within specified bounds.

We illustrate the proposed approach by applying it to the study of temporal locality, the property of an algorithm that enables efficient implementations on machines where memory access latency varies with the location being accessed. As a first step, we consider only serial implementations, which can be viewed as defined by the choice of an operation schedule and of a memory management policy. We discuss how, for a fixed operation schedule, temporal locality can be characterized for interesting classes of uniform hierarchical machines by a set of metrics, the width lengths of the schedule, whose number is only logarithmic in the number of operations. Moreover, a portable memory management of any schedule can be obtained for such classes of machines.

The situation becomes more complex when the schedule is a degree of freedom of the implementation. Then, while some computations do admit a single schedule that is optimal across all machines, this is not always the case. Thus, in general, only the less stringent notion of portability, based on parametrized schedules, can be pursued. Correspondingly, a concise characterization of temporal locality becomes harder to achieve and remains an interesting open problem.
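
To make the role of the operation schedule concrete, the following is a minimal sketch and not the paper's own model or metrics. It assumes a serial machine whose memory is managed as an LRU stack and whose access latency is a function of the depth at which the requested item currently sits. The function names, the LRU policy, and the linear latency function are all illustrative assumptions. Two schedules that touch the same data the same number of times are compared by total access cost.

```python
def lru_stack_depths(trace):
    """Return, for each access in the trace, the 1-based depth of the
    accessed address in an LRU stack (most recently used item on top).

    Assumption: a first-time (cold) access is charged as if the item
    sat just below everything seen so far.
    """
    stack = []   # stack[0] is the most recently used address
    depths = []
    for addr in trace:
        if addr in stack:
            depth = stack.index(addr) + 1
            stack.remove(addr)
        else:
            depth = len(stack) + 1
        depths.append(depth)
        stack.insert(0, addr)  # the accessed item moves to the top
    return depths


def total_cost(trace, latency):
    """Sum a hypothetical latency function over all access depths."""
    return sum(latency(d) for d in lru_stack_depths(trace))


# Two hypothetical schedules that access items a..d twice each.
schedule_reuse_soon = ["a", "a", "b", "b", "c", "c", "d", "d"]
schedule_reuse_late = ["a", "b", "c", "d", "a", "b", "c", "d"]


def linear(depth):
    # Hypothetical latency: cost grows linearly with depth.
    return depth


cost_soon = total_cost(schedule_reuse_soon, linear)  # 14
cost_late = total_cost(schedule_reuse_late, linear)  # 26
```

Under these assumptions the schedule that reuses each item immediately costs 14, while the one that defers reuse costs 26. Replacing `linear` with a different latency function changes the size of the gap, which is the sense in which a schedule's quality depends on the machine it runs on.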