Memory systems in the many-core era: challenges, opportunities, and solution directions

Authors:
Onur Mutlu
Affiliations:
Carnegie Mellon University, Pittsburgh, PA, USA
Venue:
Proceedings of the international symposium on Memory management
Year:
2011

Citing 13
Cited 1

Energy Management for Commercial Servers

Computer
Stall-Time Fair Memory Access Scheduling for Chip Multiprocessors

Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Memory performance attacks: denial of memory service in multi-core systems

SS'07 Proceedings of 16th USENIX Security Symposium on USENIX Security Symposium
Parallelism-Aware Batch Scheduling: Enhancing both Performance and Fairness of Shared DRAM Systems

ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Distributed order scheduling and its application to multi-core dram controllers

Proceedings of the twenty-seventh ACM symposium on Principles of distributed computing
Architecting phase change memory as a scalable dram alternative

Proceedings of the 36th annual international symposium on Computer architecture
Scalable high performance main memory system using phase-change memory technology

Proceedings of the 36th annual international symposium on Computer architecture
Application-aware prioritization mechanisms for on-chip networks

Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Fairness via source throttling: a configurable and high-performance fairness substrate for multi-core memory systems

Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Aérgia: exploiting packet latency slack in on-chip networks

Proceedings of the 37th annual international symposium on Computer architecture
Next generation on-chip networks: what kind of congestion control do we need?

Hotnets-IX Proceedings of the 9th ACM SIGCOMM Workshop on Hot Topics in Networks
Thread Cluster Memory Scheduling: Exploiting Differences in Memory Access Behavior

MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Prefetch-aware shared resource management for multi-core systems

Proceedings of the 38th annual international symposium on Computer architecture

On-chip traffic regulation to reduce coherence protocol cost on a microthreaded many-core architecture with distributed caches

ACM Transactions on Embedded Computing Systems (TECS) - Special Issue on Design Challenges for Many-Core Processors, Special Section on ESTIMedia'13 and Regular Papers

Quantified Score

Hi-index	0.00

Visualization

Abstract

The memory subsystem is a fundamental performance and energy bottleneck in almost all computing systems. Recent trends towards increasingly more cores on die, consolidation of diverse workloads on a single chip, and difficulty of DRAM scaling impose new requirements and exacerbate old demands on the memory system. In particular, the need for memory bandwidth and capacity is increasing [14], applications' interference in memory system increasingly limits system performance and makes the system hard to control [12], memory energy and power are key design concerns [8], and DRAM technology consumes significant amount of energy and does not scale down easily to smaller technology nodes [7]. Fortunately, some promising solution directions exist. In this talk, we will examine recent technology, application, and architecture trends motivating a fundamental rethinking of the memory hierarchy. Based on this motivation, we will describe requirements from an ideal memory system suitable for the many-core era. The talk will examine questions one would need to answer in approximating the ideal memory system and possible avenues that seem promising for the research community to explore. In particular, we will focus on the problem of uncontrolled inter-application interference in the memory system and draw upon our experiences in solving it by designing quality-of-service (QoS) aware memory controllers [5, 6, 9, 10, 11, 12], interconnects [1 2 13], and entire memory systems. We will make a case forapplication- and QoS-aware design of memory systems and [3, 4]integrated/cooperative design of cores, interconnects, and memory components to optimize the overall system.