Characterizing computer performance with a single number
Communications of the ACM
Scalable Service Differentiation in a Shared Storage Cache
ICDCS '03 Proceedings of the 23rd International Conference on Distributed Computing Systems
CQoS: a framework for enabling QoS in shared caches of CMP platforms
Proceedings of the 18th annual international conference on Supercomputing
Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
Architectural support for operating system-driven CMP cache management
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
SPEC CPU2006 benchmark descriptions
ACM SIGARCH Computer Architecture News
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
QoS policies and architecture for cache/memory in CMP platforms
Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Cooperative cache partitioning for chip multiprocessors
Proceedings of the 21st annual international conference on Supercomputing
A Framework for Providing Quality of Service in Chip Multi-Processors
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Rate-based QoS techniques for cache/memory in CMP platforms
Proceedings of the 23rd international conference on Supercomputing
SHARP control: controlled shared cache management in chip multiprocessors
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Statistical Modeling and Analysis for Complex Data Problems
Statistical Modeling and Analysis for Complex Data Problems
Hi-index | 0.00 |
We present a fully-automated, model based, multilayer cache partitioning scheme for multiprogram workloads running on multicore machines. As opposed to prior efforts, this scheme partitions shared caches at multiple layers simultaneously in a coordinated fashion. This scheme tries to achieve two objectives. First, it tries to satisfy the specified quality of service (QoS) values for all applications by partitioning the shared cache hierarchy across them, and second, it distributes the remaining excess cache capacity (if any) across applications such that a global performance metric is maximized. Our experimental analysis shows that the proposed multilayer partitioning scheme generates, on average, 33.1% improvement (on the weighted speedup metric) over the next best-performing scheme and is very successful in satisfying the QoS requirements of applications. Also, we show that partitioning each layer in isolation cannot generate the benefits obtained through our coordinated partitioning scheme. In addition, we observed that the difference between our scheme and an optimal scheme (that derives best dynamic partitions) was less than 15% for all the workloads tested and 6.6% on average.