One of the significant differences between the CRAY X-MP and its predecessor, the CRAY-1S, is a considerably increased memory bandwidth for vector operations. Up to three vector streams in each of the two processors may be active simultaneously. These streams contend for memory banks as well as data paths. All memory conflicts are resolved dynamically by the memory system. This paper describes a simulation study of the CRAY X-MP interleaved memory system, with attention focused on steady-state performance for sequences of vector operations. Because it is more amenable to analysis, we first study the interaction of vector streams issued from a single processor. We identify the occurrence of linked conflicts: repeating sequences of conflicts between two or more vector streams that result in reduced steady-state performance. Both worst-case and average-case performance measures are given. The discussion then turns to dual-processor interactions. Finally, based on our simulations, possible modifications to the CRAY X-MP memory system are proposed and compared. These modifications are intended to eliminate or reduce the effects of linked conflicts.
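
To make the idea of vector streams contending for interleaved banks concrete, the following is a minimal sketch of a bank-conflict simulation. It is not the simulator used in the study. The bank count, the bank busy time, the fixed-priority arbitration rule, and the names `Stream` and `simulate` are all assumptions chosen for illustration, and the sketch does not model data paths, memory sections, or the actual CRAY X-MP conflict-resolution rules.

```python
from dataclasses import dataclass


@dataclass
class Stream:
    """One vector stream (hypothetical model, not the X-MP's port logic)."""
    start: int       # word address of the first element
    stride: int      # address increment between successive elements
    length: int      # number of elements to access
    issued: int = 0  # elements granted so far


def simulate(streams, num_banks=16, bank_busy=4):
    """Return the number of clock cycles needed to complete all streams.

    Each cycle, every unfinished stream requests the bank that holds its
    next element (address modulo num_banks). A request is granted only if
    that bank is free; when two streams want the same bank, the stream
    listed first wins (assumed fixed priority). A granted bank then stays
    busy for bank_busy cycles, and a blocked stream retries next cycle.
    """
    free_at = [0] * num_banks  # first cycle at which each bank is free again
    cycle = 0
    while any(s.issued < s.length for s in streams):
        for s in streams:
            if s.issued >= s.length:
                continue
            bank = (s.start + s.issued * s.stride) % num_banks
            if free_at[bank] <= cycle:
                free_at[bank] = cycle + bank_busy
                s.issued += 1
        cycle += 1
    return cycle


if __name__ == "__main__":
    # A unit-stride stream touches a different bank every cycle.
    print(simulate([Stream(start=0, stride=1, length=64)]))
    # A stride equal to the bank count hits the same bank every time.
    print(simulate([Stream(start=0, stride=16, length=64)]))
    # Two unit-stride streams starting in the same bank.
    print(simulate([Stream(0, 1, 64), Stream(0, 1, 64)]))
```

In this simplified model, the two unit-stride streams collide only at startup, after which the second stream trails the first and both proceed at full rate. A stride that is a multiple of the bank count instead serializes every access behind the bank busy time. Reproducing the linked conflicts described above would require modelling the additional contention points and priority rules of the real memory system.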