DRAMsim: a memory system simulator
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
The M5 Simulator: Modeling Networked Systems
IEEE Micro
Architecting phase change memory as a scalable dram alternative
Proceedings of the 36th annual international symposium on Computer architecture
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Energy- and endurance-aware design of phase change memory caches
Proceedings of the Conference on Design, Automation and Test in Europe
A morphable phase change memory architecture considering frequent zero values
ICCD '11 Proceedings of the 2011 IEEE 29th International Conference on Computer Design
Preventing PCM banks from seizing too much power
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Improving write operations in MLC phase change memory
HPCA '12 Proceedings of the 2012 IEEE 18th International Symposium on High-Performance Computer Architecture
A case for exploiting subarray-level parallelism (SALP) in DRAM
Proceedings of the 39th Annual International Symposium on Computer Architecture
Hi-index | 0.00 |
Enabling subarrays reduces memory latency by allowing concurrent accesses to different subarrays within the same bank in the DRAM system. However, this technology has great challenges in the PCM system since an on-going write cannot overlap with other accesses due to large electric current draw for writes. This paper proposes two new mechanisms (PASAK and WAVAK) that leverage subarray-level parallelism to enable a bank to serve a write and multiple reads in parallel without violating power constraints. PASAK exploits the electric current difference between writing a bit 0 and a bit 1, and provides a new power allocation strategy that better utilizes the power budget to mitigate the performance degradation due to bank conflicts. WAVAK adds a simple coding method that inverts all bits to be written if there are more zeros than ones, with a goal to reduce electric current for writes and create larger power surplus to serve more reads if there is no subarray conflict. Experimental results under 4-cores SPEC CPU 2006 workloads show that our proposed mechanisms can reduce memory latency by 68.7% and running time by 34.8% on average, comparing with the standard PCM system. In addition, our mechanisms outperform Flip-N-Write 14.6% in latency and 8.5% in running time on average.