Conflict-free access of vectors with power-of-two strides
ICS '92 Proceedings of the 6th international conference on Supercomputing
Eliminating cache conflict misses through XOR-based placement functions
ICS '97 Proceedings of the 11th international conference on Supercomputing
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Frequent value locality and its applications
ACM Transactions on Embedded Computing Systems (TECS)
Accurate and Complexity-Effective Spatial Pattern Prediction
HPCA '04 Proceedings of the 10th International Symposium on High Performance Computer Architecture
Proceedings of the 33rd annual international symposium on Computer Architecture
Architecting phase change memory as a scalable dram alternative
Proceedings of the 36th annual international symposium on Computer architecture
A durable and energy efficient main memory using phase change memory technology
Proceedings of the 36th annual international symposium on Computer architecture
Data manipulation techniques to reduce phase change memory write energy
Proceedings of the 14th ACM/IEEE international symposium on Low power electronics and design
A class of optimal minimum odd-weight-column SEC-DED codes
IBM Journal of Research and Development
Characterizing and mitigating the impact of process variations on phase change based memory systems
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Enhancing lifetime and security of PCM-based main memory with start-gap wear leveling
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Dynamically replicated memory: building reliable systems from nanoscale resistive memories
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Use ECP, not ECC, for hard failures in resistive memories
Proceedings of the 37th annual international symposium on Computer architecture
Morphable memory system: a robust architecture for exploiting multi-level phase change memories
Proceedings of the 37th annual international symposium on Computer architecture
Proceedings of the 37th annual international symposium on Computer architecture
FREE-p: Protecting non-volatile memory against both hard and soft errors
HPCA '11 Proceedings of the 2011 IEEE 17th International Symposium on High Performance Computer Architecture
LLS: Cooperative integration of wear-leveling and salvaging for PCM main memory
DSN '11 Proceedings of the 2011 IEEE/IFIP 41st International Conference on Dependable Systems&Networks
Energy-efficient multi-level cell phase-change memory system with data encoding
ICCD '11 Proceedings of the 2011 IEEE 29th International Conference on Computer Design
Preventing PCM banks from seizing too much power
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Pay-As-You-Go: low-overhead hard-error correction for phase change memories
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Improving write operations in MLC phase change memory
HPCA '12 Proceedings of the 2012 IEEE 18th International Symposium on High-Performance Computer Architecture
Quantifying the effectiveness of load balance algorithms
Proceedings of the 26th ACM international conference on Supercomputing
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Accelerating write by exploiting PCM asymmetries
HPCA '13 Proceedings of the 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA)
Hi-index | 0.00 |
Write bandwidth is an inherent performance bottleneck for Phase Change Memory (PCM) for two reasons. First, PCM cells have long programming time, and second, only a limited number of PCM cells can be programmed concurrently due to programming current and write circuit constraints, For each PCM write, the data bits of the write request are typically mapped to multiple cell groups and processed in parallel. We observed that an unbalanced distribution of modified data bits among cell groups significantly increases PCM write time and hurts effective write bandwidth. To address this issue, we first uncover the cyclical and cluster patterns for modified data bits. Next, we propose double XOR mapping (D-XOR) to distribute modified data bits among cell groups in a balanced way. D-XOR can reduce PCM write service time by 45% on average, which increases PCM write throughput by 1.8x. As error correction (redundant bits) is critical for PCM, we also consider the impact of redundancy information in mapping data and error correction bits to cell groups. Our techniques lead to a 51% average reduction in write service time for a PCM main memory with ECC, which increases IPC by 12%.