Performance of cached DRAM organizations in vector supercomputers
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
The impact of architectural trends on operating system performance
SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
Evaluation of design alternatives for a multiprocessor microprocessor
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Memory bandwidth limitations of future microprocessors
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Missing the memory wall: the case for processor/memory integration
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Design and evaluation of dynamic access ordering hardware
ICS '96 Proceedings of the 10th international conference on Supercomputing
Memory system characterization of commercial workloads
Proceedings of the 25th annual international symposium on Computer architecture
Performance characterization of a Quad Pentium Pro SMP using OLTP workloads
Proceedings of the 25th annual international symposium on Computer architecture
Computer architecture (2nd ed.): a quantitative approach
Computer architecture (2nd ed.): a quantitative approach
Performance of database workloads on shared-memory systems with out-of-order processors
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
A performance comparison of contemporary DRAM architectures
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
New DRAM Technologies: A Comprehensive Analysis of the New Architecture
New DRAM Technologies: A Comprehensive Analysis of the New Architecture
The use of static column ram as a memory hierarchy
ISCA '84 Proceedings of the 11th annual international symposium on Computer architecture
Access ordering and memory-conscious cache utilization
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Performance Characterization of the Alpha 21164 Microprocessor Using TP and SPEC Workloads
HPCA '96 Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture
The impact of shared-cache clustering in small-scale shared-memory multiprocessors
HPCA '96 Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture
Performance Characterization of the Pentium® Pro Processor
HPCA '97 Proceedings of the 3rd IEEE Symposium on High-Performance Computer Architecture
Modern dram architectures
Transparent data-memory organizations for digital signal processors
CASES '01 Proceedings of the 2001 international conference on Compilers, architecture, and synthesis for embedded systems
Sourcebook of parallel computing
A study of performance impact of memory controller features in multi-processor server environment
WMPI '04 Proceedings of the 3rd workshop on Memory performance issues: in conjunction with the 31st international symposium on computer architecture
Spectral prefetcher: An effective mechanism for L2 cache prefetching
ACM Transactions on Architecture and Code Optimization (TACO)
Intelligent memory manager: reducing cache pollution due to memory management functions
Journal of Systems Architecture: the EUROMICRO Journal
A low-cost memory remapping scheme for address bus protection
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
A high-end real-time digital film processing reconfigurable platform
EURASIP Journal on Embedded Systems
A low-cost memory remapping scheme for address bus protection
Journal of Parallel and Distributed Computing
Memory access schedule minimization for embedded systems
Journal of Systems Architecture: the EUROMICRO Journal
Rank idle time prediction driven last-level cache writeback
Proceedings of the 2012 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness
Improving writeback efficiency with decoupled last-write prediction
Proceedings of the 39th Annual International Symposium on Computer Architecture
Buffer-on-board memory systems
Proceedings of the 39th Annual International Symposium on Computer Architecture
Hi-index | 14.98 |
This paper presents a simulation-based performance study of several of the new high-performance DRAM architectures, each evaluated in a small system organization. These small-system organizations correspond to workstation-class computers and use only a handful of DRAM chips (~10, as opposed to ~1 or ~100). The study covers Fast Page Mode, Extended Data Out, Synchronous, Enhanced Synchronous, Double Data Rate, Synchronous Link, Rambus, and Direct Rambus designs. Our simulations reveal several things: 1) Current advanced DRAM technologies are attacking the memory bandwidth problem but not the latency problem; 2) bus transmission speed will soon become a primary factor limiting memory-system performance; 3) the post-L2 address stream still contains significant locality, though it varies from application to application; 4) systems without L2 caches are feasible for low- and medium-speed CPUs (1GHz and below); and 5) as we move to wider buses, row access time becomes more prominent, making it important to investigate techniques to exploit the available locality to decrease access time.