The case for a single-chip multiprocessor
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Xen and the art of virtualization
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
CQoS: a framework for enabling QoS in shared caches of CMP platforms
Proceedings of the 18th annual international conference on Supercomputing
Fair Cache Sharing and Partitioning in a Chip Multiprocessor Architecture
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
Predicting Inter-Thread Cache Contention on a Chip Multi-Processor Architecture
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
Intel Virtualization Technology
Computer
Architectural support for operating system-driven CMP cache management
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Communist, utilitarian, and capitalist cache policies on CMPs: caches as a shared resource
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Globally-Synchronized Frames for Guaranteed Quality-of-Service in On-Chip Networks
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
A case for integrated processor-cache partitioning in chip multiprocessors
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
SHARP control: controlled shared cache management in chip multiprocessors
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Resource sharing in performance models
EPEW'07 Proceedings of the 4th European performance engineering conference on Formal methods and stochastic models for performance evaluation
Replacement policies for shared caches on symmetric multicores: a programmer-centric point of view
Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers
Virtualizing network-on-chip resources in chip-multiprocessors
Microprocessors & Microsystems
METE: meeting end-to-end QoS in multicores through system-wide resource management
Proceedings of the ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Vantage: scalable and efficient fine-grain cache partitioning
Proceedings of the 38th annual international symposium on Computer architecture
METE: meeting end-to-end QoS in multicores through system-wide resource management
ACM SIGMETRICS Performance Evaluation Review - Performance evaluation review
The gradient-based cache partitioning algorithm
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
Region scheduling: efficiently using the cache architectures via page-level affinity
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Network-on-Chip virtualization in Chip-Multiprocessor Systems
Journal of Systems Architecture: the EUROMICRO Journal
Courteous cache sharing: being nice to others in capacity management
Proceedings of the 49th Annual Design Automation Conference
Globally Synchronized Frames for guaranteed quality-of-service in on-chip networks
Journal of Parallel and Distributed Computing
Survey of scheduling techniques for addressing shared resources in multicore processors
ACM Computing Surveys (CSUR)
Supporting faulty banks in NUCA by NoC assisted remapping mechanisms
The Journal of Supercomputing
Hi-index | 0.00 |
As more and more cores are enabled on the die of future CMP platforms, we expect that several diverse workloads will run simultaneously on the platform. A key example of this trend is the growth of virtualization usage models. When multiple virtual machines or applications or threads run simultaneously, the quality of service (QoS) that the platform provides to each individual thread is non-deterministic today. This occurs because the simultaneously running threads place very different demands on the shared resources (cache space, memory bandwidth, etc) in the platform and in most cases contend with each other. In this paper, we first present case studies that show how this results in non-deterministic performance. Unlike the compute resources managed through scheduling, platform resource allocation to individual threads cannot be controlled today. In order to provide better determinism and QoS, we then examine resource management mechanisms and present QoS-aware architectures and execution environments. The main contribution of this paper is the architecture feasibility analysis through prototypes that allow experimentation with QoS-Aware execution environments and architectural resources. We describe these QoS prototypes and then present preliminary case studies of multi-tasking and virtualization usage models sharing one critical CMP resource (last-level cache). We then demonstrate how proper management of the cache resource can provide service differentiation and deterministic performance behavior when running disparate workloads in future CMP platforms.