The SB-PRAM is a parallel architecture that uses (i) multithreading to hide latency, (ii) a pipelined combining butterfly network to reduce hot spots, and (iii) address hashing to randomize network traffic and reduce memory module congestion. Previous work suggests that such a machine will efficiently simulate shared memory with constant access time independent of the number of processors (i.e., the theoretical PRAM model), provided enough threads can be kept busy. A prototype of a 64-processor SB-PRAM has been completed. We report technical data about this prototype as well as performance measurements. On all benchmark programs measured so far, the real machine was at most 1.37% slower than predicted by simulations that assume perfect shared memory with uniform access time.
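To illustrate why address hashing reduces memory module congestion, here is a minimal sketch in Python. It is not the SB-PRAM's actual hash function: the multiplicative hash, the constant `MULTIPLIER`, the 32-bit address width, and the choice of 64 memory modules are all assumptions made for illustration. The sketch compares how a strided access pattern lands on modules with a plain modulo mapping versus a hashed mapping.

```python
from collections import Counter

NUM_MODULES = 64          # assumption: 64 memory modules, for illustration only
ADDRESS_BITS = 32         # assumption: 32-bit addresses
MODULE_BITS = 6           # log2(NUM_MODULES)
MULTIPLIER = 0x9E3779B1   # hypothetical odd constant, not taken from the SB-PRAM

def module_plain(addr):
    """Unhashed mapping: the low address bits select the memory module."""
    return addr % NUM_MODULES

def module_hashed(addr):
    """Hashed mapping: multiply by an odd constant modulo 2**32 and
    use the top bits of the result to select the memory module."""
    scrambled = (addr * MULTIPLIER) & 0xFFFFFFFF
    return scrambled >> (ADDRESS_BITS - MODULE_BITS)

def max_load(addresses, mapping):
    """Number of requests received by the most heavily loaded module."""
    return max(Counter(mapping(a) for a in addresses).values())

# A strided pattern (stride = NUM_MODULES) is the worst case for the
# plain mapping: every address falls into module 0.
strided = [i * NUM_MODULES for i in range(1024)]

plain_load = max_load(strided, module_plain)
hashed_load = max_load(strided, module_hashed)
print("plain mapping, busiest module: ", plain_load)
print("hashed mapping, busiest module:", hashed_load)
```

With the plain mapping, all 1024 requests queue at a single module. With the hashed mapping, the same requests are distributed over several modules, so the busiest module sees fewer of them. This is the effect the abstract refers to as randomizing traffic.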