A memory design based on logical banks is analyzed for shared-memory multiprocessor systems. In this design, each physical bank is replaced by a logical bank consisting of a fast register and subbanks of slower memory. The subbanks are buffered by input and output queues, which substantially reduce the effective cycle time when the reference rate is below saturation. The principal contribution of this work is a simple analytical model that yields scaling relationships among the efficiency, the bank cycle time, the number of processors, the size of the buffers, and the granularity of the banks. These scaling relationships imply that if the interconnection network has sufficient bandwidth to support efficient access using high-speed memory, then lower-speed memory can be substituted with little additional interconnection cost. The scaling relationships are shown to hold for a full datapath vector simulation based on the Cray Y-MP architecture. The model is used to develop design criteria for a system that supports 192 independent reference streams, and the performance of this system is evaluated by simulation over a range of loading conditions.
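The effect of buffering below saturation can be illustrated with a toy simulation. The sketch below is not the paper's analytical model or its Cray Y-MP simulator; it is a minimal illustration under stated assumptions: uniformly random bank references, one reference attempt per processor per clock, a processor that stalls and retries the same bank when its request is rejected, and a single input queue per bank (the fast register and output queues are omitted). The function name `simulate` and all parameter names are hypothetical.

```python
import random


def simulate(processors, banks, cycle_time, queue_size, cycles=20000, seed=1):
    """Return the fraction of issue slots in which a reference was accepted.

    Toy model (an assumption, not the paper's): each bank holds one request
    in service plus up to `queue_size` waiting requests, and completes one
    request every `cycle_time` clocks.
    """
    rng = random.Random(seed)
    pending = [0] * banks          # requests held at each bank (in service + queued)
    remaining = [0] * banks        # clocks left for the request currently in service
    target = [None] * processors   # bank a processor is trying to reach, if any
    accepted = 0

    for _ in range(cycles):
        # Advance every busy bank by one clock.
        for b in range(banks):
            if pending[b] > 0:
                remaining[b] -= 1
                if remaining[b] == 0:
                    pending[b] -= 1
                    if pending[b] > 0:
                        remaining[b] = cycle_time  # start the next queued request

        # Each processor attempts one reference; a rejected one is retried.
        for p in range(processors):
            if target[p] is None:
                target[p] = rng.randrange(banks)
            b = target[p]
            if pending[b] < queue_size + 1:
                if pending[b] == 0:
                    remaining[b] = cycle_time
                pending[b] += 1
                accepted += 1
                target[p] = None

    return accepted / (processors * cycles)


if __name__ == "__main__":
    # 8 processors, 64 banks, bank cycle time of 4 clocks: demand is below
    # the aggregate bank bandwidth of 64 / 4 = 16 references per clock.
    print("unbuffered:", simulate(8, 64, 4, queue_size=0))
    print("buffered:  ", simulate(8, 64, 4, queue_size=4))
```

In this toy setting, the efficiency is the accepted fraction of issue slots. Varying `cycle_time`, `processors`, `queue_size`, and `banks` gives a rough feel for the kind of trade-offs the abstract describes, although the numbers it produces are not results from the paper.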