Compositional, efficient caches for a chip multi-processor

  • Authors:
  • A. M. Molnos;M. J. M. Heijligers;S. D. Cotofana;J. T. J. van Eijndhoven

  • Affiliations:
  • Delft University of Technology, Delft, The Netherlands;Philips Research Laboratories, Eindhoven, The Netherlands;Delft University of Technology, Delft, The Netherlands;Philips Research Laboratories, Eindhoven, The Netherlands

  • Venue:
  • Proceedings of the conference on Design, automation and test in Europe: Proceedings
  • Year:
  • 2006

Abstract

In current multi-media systems, major parts of the functionality consist of software tasks executed on a set of concurrently operating processors. These tasks interfere with each other when they share memory and other hardware components. For instance, when tasks share caches and no precautions are taken, they may flush each other's data at random, and control over system performance is lost. In media processing, however, performance must be under tight control. In particular, the performance of each individual task must be preserved when tasks are executed concurrently in arbitrary combinations or when additional tasks are added. A system satisfying this property is said to be compositional. This paper proposes a novel cache partitioning technique that enhances compositionality. We assume a cache to be a rectangular array of memory elements arranged in "sets" (rows) and "ways" (columns). We perform two types of partitioning. First, each task and each block of inter-task common data gets an exclusive part of the cache sets. Second, inside the cache sets of common data, each task accessing that data gets a number of ways. We apply the proposed method on a homogeneous multiprocessor using two applications: H.264 decoding and picture-in-picture-TV. Our experiments indicate that, for both applications, under our partitioning scheme the sum of the misses of the individual tasks executed separately and the number of misses of all tasks executed concurrently differ by at most 4%. We conclude that compositionality is achieved within reasonable bounds. Additionally, our technique appears to improve the efficiency of the cache operation.
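The two partitioning levels described in the abstract can be illustrated with a toy simulator. This is only a sketch under stated assumptions, not the authors' implementation: the class name `PartitionedCache`, the partition names, the 64-byte line size, LRU replacement, and the choice to let a lookup hit in any way of a shared set while a miss replaces only within the requesting task's own ways are all hypothetical choices made for illustration.

```python
from collections import OrderedDict, defaultdict

LINE_SIZE = 64  # bytes per cache line (assumed value)


class PartitionedCache:
    """Toy cache with set partitioning plus way partitioning in shared sets."""

    def __init__(self, set_ranges, way_quota):
        # set_ranges: partition -> (first_set, num_sets); ranges must not overlap.
        # way_quota:  partition -> {task: number of ways the task may replace in}.
        self.set_ranges = set_ranges
        self.way_quota = way_quota
        # (set_index, task) -> tags held in that task's ways, oldest first.
        self.lines = defaultdict(OrderedDict)
        self.misses = defaultdict(int)

    def access(self, task, partition, address):
        """Simulate one access; return True on a hit, False on a miss."""
        first_set, num_sets = self.set_ranges[partition]
        line = address // LINE_SIZE
        set_index = first_set + line % num_sets  # stays inside the partition
        tag = line // num_sets
        # Lookup: search the ways of every task that uses this partition.
        for owner in self.way_quota[partition]:
            ways = self.lines[(set_index, owner)]
            if tag in ways:
                ways.move_to_end(tag)  # refresh LRU position
                return True
        # Miss: replace only within the requesting task's own ways.
        self.misses[task] += 1
        own = self.lines[(set_index, task)]
        if len(own) >= self.way_quota[partition][task]:
            own.popitem(last=False)  # evict the least recently used tag
        own[tag] = True
        return False


if __name__ == "__main__":
    cache = PartitionedCache(
        set_ranges={"A.private": (0, 4), "B.private": (4, 4), "fifo.AB": (8, 2)},
        way_quota={
            "A.private": {"A": 4},
            "B.private": {"B": 4},
            "fifo.AB": {"A": 1, "B": 3},
        },
    )
    for i in range(8):
        cache.access("A", "fifo.AB", i * LINE_SIZE)  # task A writes shared data
        cache.access("B", "fifo.AB", i * LINE_SIZE)  # task B reads it back
    print(dict(cache.misses))
```

Because a task's private data maps only to its own sets, the miss count on that data in this sketch does not depend on what other tasks do, which is the compositionality property the abstract targets. Interference remains possible only through the shared sets, where the way quotas bound how much one task can evict.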