ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Memory controller policies for DRAM power management
ISLPED '01 Proceedings of the 2001 international symposium on Low power electronics and design
Scheduler-based DRAM energy management
Proceedings of the 39th annual Design Automation Conference
DRAM Energy Management Using Sof ware and Hardware Directed Power Mode Control
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Dynamic tracking of page miss ratio curve for memory management
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Improving energy efficiency by making DRAM less randomly accessed
ISLPED '05 Proceedings of the 2005 international symposium on Low power electronics and design
Queue - Multiprocessors
DRAMsim: a memory system simulator
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
Computer Architecture, Fourth Edition: A Quantitative Approach
Computer Architecture, Fourth Edition: A Quantitative Approach
Design and implementation of power-aware virtual memory
ATEC '03 Proceedings of the annual conference on USENIX Annual Technical Conference
Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Stall-Time Fair Memory Access Scheduling for Chip Multiprocessors
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Parallelism-Aware Batch Scheduling: Enhancing both Performance and Fairness of Shared DRAM Systems
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Corona: System Implications of Emerging Nanophotonic Technology
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Understanding and Designing New Server Architectures for Emerging Warehouse-Computing Environments
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
3D-Stacked Memory Architectures for Multi-core Processors
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Self-Optimizing Memory Controllers: A Reinforcement Learning Approach
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
PowerNap: eliminating server idle power
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Prefetch-Aware DRAM Controllers
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Mini-rank: Adaptive DRAM architecture for improving memory power efficiency
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Memory Systems: Cache, DRAM, Disk
Memory Systems: Cache, DRAM, Disk
DRAM errors in the wild: a large-scale field study
Proceedings of the eleventh international joint conference on Measurement and modeling of computer systems
Disaggregated memory for expansion and sharing in blade servers
Proceedings of the 36th annual international symposium on Computer architecture
Scaling the bandwidth wall: challenges in and avenues for CMP scaling
Proceedings of the 36th annual international symposium on Computer architecture
Multicore DIMM: an Energy Efficient Memory Module with Independently Controlled DRAMs
IEEE Computer Architecture Letters
The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
Future scaling of processor-memory interfaces
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
Micro-pages: increasing DRAM efficiency with locality-aware data placement
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Virtualized and flexible ECC for main memory
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Re-architecting DRAM memory systems with monolithically integrated silicon photonics
Proceedings of the 37th annual international symposium on Computer architecture
Re-architecting DRAM memory systems with monolithically integrated silicon photonics
Proceedings of the 37th annual international symposium on Computer architecture
Understanding the Energy Consumption of Dynamic Random Access Memories
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
MemScale: active low-power modes for main memory
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
Proceedings of the 38th annual international symposium on Computer architecture
System implications of memory reliability in exascale computing
Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis
Improving System Energy Efficiency with Memory Rank Subsetting
ACM Transactions on Architecture and Code Optimization (TACO)
Multiple sub-row buffers in DRAM: unlocking performance and energy improvement opportunities
Proceedings of the 26th ACM international conference on Supercomputing
MultiScale: memory system DVFS with multiple memory controllers
Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design
BOOM: enabling mobile memory based low-power server DIMMs
Proceedings of the 39th Annual International Symposium on Computer Architecture
Towards energy-proportional datacenter memory with mobile DRAM
Proceedings of the 39th Annual International Symposium on Computer Architecture
LOT-ECC: localized and tiered reliability mechanisms for commodity memory systems
Proceedings of the 39th Annual International Symposium on Computer Architecture
A case for exploiting subarray-level parallelism (SALP) in DRAM
Proceedings of the 39th Annual International Symposium on Computer Architecture
A software memory partition approach for eliminating bank-level interference in multicore systems
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
RAMZzz: rank-aware dram power management with dynamic migrations and demotions
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
A study of DRAM failures in the field
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
A survey of architectural techniques for DRAM power management
International Journal of High Performance Systems Architecture
Asymmetric DRAM synthesis for heterogeneous chip multiprocessors in 3D-stacked architecture
Proceedings of the International Conference on Computer-Aided Design
Leveraging Heterogeneity in DRAM Main Memories to Accelerate Critical Word Access
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Rethinking DRAM Power Modes for Energy Proportionality
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Conservative row activation to improve memory power efficiency
Proceedings of the 27th international ACM conference on International conference on supercomputing
Reducing memory access latency with asymmetric DRAM bank organizations
Proceedings of the 40th Annual International Symposium on Computer Architecture
Resilient die-stacked DRAM caches
Proceedings of the 40th Annual International Symposium on Computer Architecture
Pragmatic integration of an SRAM row cache in heterogeneous 3-D DRAM architecture using TSV
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Exploring DRAM organizations for energy-efficient and resilient exascale memories
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Low-power, low-storage-overhead chipkill correct via multi-line error correction
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Coordinate page allocation and thread group for improving main memory power efficiency
Proceedings of the Workshop on Power-Aware Computing and Systems
RowClone: fast and energy-efficient in-DRAM bulk data copy and initialization
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture
A circuit-architecture co-optimization framework for exploring nonvolatile memory hierarchies
ACM Transactions on Architecture and Code Optimization (TACO)
ACM Transactions on Architecture and Code Optimization (TACO)
Reducing DRAM row activations with eager read/write clustering
ACM Transactions on Architecture and Code Optimization (TACO)
Unleashing the potential of MLC STT-RAM caches
Proceedings of the International Conference on Computer-Aided Design
Direct distributed memory access for CMPs
Journal of Parallel and Distributed Computing
Refresh pausing in DRAM memory systems
ACM Transactions on Architecture and Code Optimization (TACO)
ACM Transactions on Architecture and Code Optimization (TACO)
Hi-index | 0.00 |
DRAM vendors have traditionally optimized the cost-per-bit metric, often making design decisions that incur energy penalties. A prime example is the overfetch feature in DRAM, where a single request activates thousands of bit-lines in many DRAM chips, only to return a single cache line to the CPU. The focus on cost-per-bit is questionable in modern-day servers where operating costs can easily exceed the purchase cost. Modern technology trends are also placing very different demands on the memory system: (i)queuing delays are a significant component of memory access time, (ii) there is a high energy premium for the level of reliability expected for business-critical computing, and (iii) the memory access stream emerging from multi-core systems exhibits limited locality. All of these trends necessitate an overhaul of DRAM architecture, even if it means a slight compromise in the cost-per-bit metric. This paper examines three primary innovations. The first is a modification to DRAM chip microarchitecture that re tains the traditional DDRx SDRAMinterface. Selective Bit-line Activation (SBA) waits for both RAS (row address) and CAS (column address) signals to arrive before activating exactly those bitlines that provide the requested cache line. SBA reduces energy consumption while incurring slight area and performance penalties. The second innovation, Single Subarray Access (SSA), fundamentally re-organizes the layout of DRAM arrays and the mapping of data to these arrays so that an entire cache line is fetched from a single subarray. It requires a different interface to the memory controller, reduces dynamic and background energy (by about 6X), incurs a slight area penalty (4%), and can even lead to performance improvements (54% on average) by reducing queuing delays. The third innovation further penalizes the cost-per-bit metric by adding a checksum feature to each cache line. This checksum error-detection feature can then be used to build stronger RAID-like fault tolerance, including chipkill-level reliability. Such a technique is especially crucial for the SSA architecture where the entire cache line is localized to a single chip. This DRAM chip microarchitectural change leads to a dramatic reduction in the energy and storage overheads for reliability. The proposed architectures will also apply to other emerging memory technologies (such as resistive memories) and will be less disruptive to standards, interfaces, and the design flow if they can be incorporated into first-generation designs.