Efficient on-chip resource management is crucial for Chip Multiprocessors (CMPs) to achieve high resource utilization and enforce system-level performance objectives. Existing schemes for managing multiple resources focus either on intra-core resources or on inter-core resources, missing the opportunity to exploit the interaction between these two levels of resources. Moreover, these schemes rely either on trial runs or on complex online machine-learning models to search for an appropriate resource allocation, which makes resource management inefficient and expensive. To address these limitations, this paper presents a predictive yet cost-effective mechanism for multiple resource management in CMPs. It uses a set of hardware-efficient online profilers and an analytical performance model to predict an application's performance under different intra-core and/or inter-core resource allocations. Based on the predicted performance, the resource allocator identifies and enforces near-optimal resource partitions for each epoch without any trial runs. The experimental results show that the proposed predictive resource management framework improves the weighted speedup of the CMP system by an average of 11.6% compared with the equal-partition scheme, and by 9.3% compared with an existing reactive resource management scheme.
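
The idea of choosing a partition from predicted performance, without trial runs, can be illustrated with a minimal Python sketch. It is not the paper's profilers, model, or allocator. The toy CPI model in `predict_ipc`, the profile fields (`base_cpi`, `miss_penalty`, `mpi_by_ways`), the use of cache ways as the only partitioned resource, and the exhaustive search are all assumptions made for illustration. Weighted speedup is taken in its usual form, the sum over applications of shared-mode IPC divided by stand-alone IPC.

```python
from itertools import product


def weighted_speedup(shared_ipc, alone_ipc):
    """Sum of per-application IPC ratios (shared run vs. running alone)."""
    return sum(s / a for s, a in zip(shared_ipc, alone_ipc))


def partitions(total, n):
    """Yield every split of `total` units among n apps, at least one unit each."""
    for combo in product(range(1, total + 1), repeat=n):
        if sum(combo) == total:
            yield combo


def predict_ipc(profile, ways):
    """Hypothetical analytical model: CPI = base CPI + miss penalty * misses per instruction.

    `profile` stands in for what an online profiler might report; the field
    names are assumptions, not taken from the paper.
    """
    cpi = profile["base_cpi"] + profile["miss_penalty"] * profile["mpi_by_ways"][ways]
    return 1.0 / cpi


def best_partition(profiles, total_ways):
    """Pick the partition with the highest predicted weighted speedup.

    Every candidate is scored with the model only, so no trial run is needed.
    Exhaustive search is used for clarity and only suits small configurations.
    """
    alone = [predict_ipc(p, total_ways) for p in profiles]
    best, best_ws = None, float("-inf")
    for part in partitions(total_ways, len(profiles)):
        shared = [predict_ipc(p, w) for p, w in zip(profiles, part)]
        ws = weighted_speedup(shared, alone)
        if ws > best_ws:
            best, best_ws = part, ws
    return best, best_ws
```

In this sketch, calling `best_partition` once per epoch with freshly profiled values would mirror the per-epoch decision the abstract describes. A cache-sensitive application (misses per instruction falling steeply as ways increase) paired with an insensitive one would receive most of the ways.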