Configurations of contemporary DRAM memory systems are becoming increasingly complex. A recent study shows that application performance is highly sensitive to the choice of configuration, and suggests that tuning burst sizes and channel configurations is an effective way to optimize DRAM performance for a given memory-intensive workload. However, this approach is workload dependent. In this study we show that, by utilizing fine-grain priority access scheduling, we are able to find a workload-independent configuration that achieves optimal performance on a multi-channel memory system. Our approach exploits the high concurrency and high bandwidth available on such memory systems, and effectively reduces the memory stall time of memory-intensive applications. Using execution-driven simulation of a 4-way issue, 2 GHz processor, we show that optimized fine-grain priority scheduling improves the average performance of fifteen memory-intensive SPEC2000 programs by about 13% and 8% on 2-channel and 4-channel Direct Rambus DRAM memory systems, respectively, compared with gang scheduling. Compared with burst scheduling, the average performance improvement is 16% and 14% for the 2-channel and 4-channel memory systems, respectively.
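To illustrate the general idea of fine-grain priority access scheduling, the sketch below models a multi-channel memory system in which requests are interleaved across channels by address and each channel independently issues its highest-priority pending request every cycle. This is a minimal, hypothetical sketch for intuition only — the class and function names are invented here, lower numbers are assumed to mean higher priority, and it does not reproduce the paper's actual scheduler or the Direct Rambus timing model.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Request:
    """A memory request; ordering is by (priority, arrival), lower = sooner."""
    priority: int                       # assumed: lower value = higher priority
    arrival: int                        # arrival time, breaks priority ties
    addr: int = field(compare=False)    # address, used only for channel mapping

def fine_grain_schedule(requests, num_channels):
    """Return the order in which request addresses are serviced.

    Requests are distributed to channels by address interleaving.
    Each 'cycle', every channel with pending work issues its
    highest-priority request, so channels proceed concurrently
    rather than in lock-step (as gang scheduling would).
    """
    channels = [[] for _ in range(num_channels)]
    for r in requests:
        heapq.heappush(channels[r.addr % num_channels], r)

    order = []
    while any(channels):            # some channel still has pending requests
        for q in channels:          # one issue slot per channel per cycle
            if q:
                order.append(heapq.heappop(q).addr)
    return order
```

For example, with two channels, low-priority requests on one channel never delay high-priority requests on the other, which is the source of the concurrency benefit the abstract describes.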