Exploring the design space for a shared-cache multiprocessor
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Compiler support for software-based cache partitioning
LCTES '95 Proceedings of the ACM SIGPLAN 1995 workshop on Languages, compilers, & tools for real-time systems
MediaBench: a tool for evaluating and synthesizing multimedia and communicatons systems
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Computers and Intractability: A Guide to the Theory of NP-Completeness
Computers and Intractability: A Guide to the Theory of NP-Completeness
Caches with Compositional Performance
Embedded Processor Design Challenges: Systems, Architectures, Modeling, and Simulation - SAMOS
Predictable Instruction Caching for Media Processors
ASAP '02 Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures, and Processors
OS-Controlled Cache Predictability for Real-Time Systems
RTAS '97 Proceedings of the 3rd IEEE Real-Time Technology and Applications Symposium (RTAS '97)
Computer Architecture: A Quantitative Approach
Computer Architecture: A Quantitative Approach
Extending the reach of microprocessors: column and curious caching
Extending the reach of microprocessors: column and curious caching
Dynamic Partitioning of Shared Cache Memory
The Journal of Supercomputing
Compositional Memory Systems for Data Intensive Applications
Proceedings of the conference on Design, automation and test in Europe - Volume 1
CQoS: a framework for enabling QoS in shared caches of CMP platforms
Proceedings of the 18th annual international conference on Supercomputing
Effectively sharing a cache among threads
Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures
Fair Cache Sharing and Partitioning in a Chip Multiprocessor Architecture
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
Predicting Inter-Thread Cache Contention on a Chip Multi-Processor Architecture
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
Compositional Memory Systems for Multimedia Communicating Tasks
Proceedings of the conference on Design, Automation and Test in Europe - Volume 2
Compositional, efficient caches for a chip multi-processor
Proceedings of the conference on Design, automation and test in Europe: Proceedings
An Adaptive Shared/Private NUCA Cache Partitioning Scheme for Chip Multiprocessors
HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
A dynamically reconfigurable cache for multithreaded processors
Journal of Embedded Computing - Issues in embedded single-chip multicore architectures
Hi-index | 0.00 |
This paper proposes a dynamic cache repartitioning technique that enhances compositionality on platforms executing media applications with multiple utilization scenarios. Because the repartitioning between scenarios requires a cache flush, two undesired effects may occur: (1) in particular, the execution of critical tasks may be disturbed and (2) in general, a performance penalty is involved. To cope with these effects we propose a method which: (1) determines, at design time, the cache footprint of each tasks, such that it creates the premises for critical tasks safety, and minimum flush in general, and (2) enforces, at run-time, the design time determined cache footprints and further decreases the flush penalty. We implement our dynamic cache management strategy on a CAKE multiprocessor with 4 Trimedia cores. The experimental workload consists of 6 multimedia applications, each of which formed by multiple tasks belonging to an extended MediaBench suite. We found on average that: (1) the relative variations of critical tasks execution time are less than 0.1%, regardless of the scenario switching frequency, (2) for realistic scenario switching frequencies the inter-task cache interference is at most 4% for the repartitioned cache, whereas for the shared cache it reaches 68%, and (3) the off-chip memory traffic reduces with 60%, and the performance (in cycles per instruction) enhances with 10%, when compared with the shared cache.