Compositional, Dynamic Cache Management for Embedded Chip Multiprocessors

Authors:
Anca M. Molnos;Sorin D. Cotofana;Marc J. Heijligers;Jos T. Eijndhoven
Affiliations:
NXP Semiconductors, Eindhoven, The Netherlands and Technical University of Delft, Delft, The Netherlands;Technical University of Delft, Delft, The Netherlands;NXP Semiconductors, Eindhoven, The Netherlands;Vector Fabrics, Eindhoven, The Netherlands
Venue:
Journal of Signal Processing Systems
Year:
2009

Abstract

This paper proposes a dynamic cache repartitioning technique that enhances compositionality on platforms executing media applications with multiple utilization scenarios. Because repartitioning between scenarios requires a cache flush, two undesired effects may occur: (1) the execution of critical tasks may be disturbed, and (2) a performance penalty is incurred in general. To cope with these effects we propose a method which: (1) determines, at design time, the cache footprint of each task, such that it creates the premises for critical-task safety and a minimal flush in general, and (2) enforces, at run time, the cache footprints determined at design time and further decreases the flush penalty. We implement our dynamic cache management strategy on a CAKE multiprocessor with 4 TriMedia cores. The experimental workload consists of 6 multimedia applications, each of which is formed by multiple tasks belonging to an extended MediaBench suite. We found that, on average: (1) the relative variation in the execution time of critical tasks is less than 0.1%, regardless of the scenario switching frequency, (2) for realistic scenario switching frequencies the inter-task cache interference is at most 4% for the repartitioned cache, whereas for the shared cache it reaches 68%, and (3) off-chip memory traffic is reduced by 60% and performance (in cycles per instruction) improves by 10% compared with the shared cache.
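
To illustrate why fixing task footprints at design time can limit the flush on a scenario switch, here is a minimal sketch. It is not the paper's algorithm: the set-granularity partitioning, the task names, and the rule "flush a set only when its owning task changes" are all assumptions made for illustration.

```python
# Hypothetical model: each scenario maps a task name to the cache sets it owns.

def owner_map(partition):
    """Invert {task: {sets}} into {set_index: task}."""
    return {s: task for task, sets in partition.items() for s in sets}

def sets_to_flush(old, new):
    """Return the cache sets whose owning task changes between scenarios.

    Assumption: a set must be flushed only if it held data of one task
    and is handed to a different task (or to no task) in the new scenario.
    """
    old_owner = owner_map(old)
    new_owner = owner_map(new)
    return {s for s, task in old_owner.items() if new_owner.get(s) != task}

# Illustrative scenarios; "video_dec" plays the role of a critical task
# whose footprint is kept identical across scenarios.
scenario_a = {"video_dec": {0, 1, 2, 3}, "audio_dec": {4, 5}, "ui": {6, 7}}
scenario_b = {"video_dec": {0, 1, 2, 3}, "audio_dec": {4}, "net": {5, 6, 7}}

flush = sets_to_flush(scenario_a, scenario_b)
print(sorted(flush))                           # sets handed to another task
print(flush.isdisjoint(scenario_a["video_dec"]))  # critical task untouched?
```

In this toy model, a critical task whose sets are the same in both scenarios never has its data flushed, which is the intuition behind protecting critical tasks from scenario switches.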