The evaluation of massively parallel array architectures
Computer Architecture
SIMD machines operate efficiently on a wider range of problems when they can access memory with both global and local addresses. Recent work has made it possible to use caches for globally addressed references. This paper examines techniques for employing caches to improve memory accesses made with local addresses. Specifically, we evaluate the improvement obtained from a cluster-based, indirect-only cache: a simple extension of the direct-only cache that nevertheless improves indirect memory references significantly.
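To make the global/local distinction concrete, the sketch below simulates the two SIMD addressing modes the abstract contrasts. All names here (`PE`, `direct_load`, `indirect_load`) are illustrative inventions, not terms or code from the paper: a direct (global) reference broadcasts one address to every processing element, while an indirect (local) reference lets each PE supply its own address, producing the divergent access pattern that an indirect-only cache targets.

```python
class PE:
    """One processing element with its own local memory (hypothetical model)."""
    def __init__(self, mem):
        self.mem = list(mem)

def direct_load(pes, addr):
    # Direct (global) addressing: the controller broadcasts ONE address,
    # and every PE reads the same offset in its own local memory.
    return [pe.mem[addr] for pe in pes]

def indirect_load(pes, addr_regs):
    # Indirect (local) addressing: each PE reads at an address held in its
    # own register, so references diverge across the array. This divergence
    # is what a per-cluster, indirect-only cache would aim to capture.
    return [pe.mem[a] for pe, a in zip(pes, addr_regs)]

# Four PEs, each with a 4-word local memory: PE i holds [10*i, 10*i+1, ...].
pes = [PE([10 * i + j for j in range(4)]) for i in range(4)]

print(direct_load(pes, 2))             # → [2, 12, 22, 32]
print(indirect_load(pes, [0, 1, 2, 3]))  # → [0, 11, 22, 33]
```

Under this toy model, the direct load touches the same offset everywhere, while the indirect load touches a different offset per PE, which is why caching the two access streams separately can make sense.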