The Burroughs Scientific Processor (BSP), a high-performance computer system, performed the Department of Energy LLL loops at roughly the speed of the CRAY-1. The BSP combined parallelism and pipelining, performing memory-to-memory operations. Seventeen memory units and two crossbar switch data alignment networks provided conflict-free access to most indexed arrays. Fast linear recurrence algorithms provided good performance on constructs that some machines execute serially. A system manager computer ran the operating system and a vectorizing Fortran compiler. An MOS file memory system served as a high bandwidth secondary memory.
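The benefit of a prime number of memory units can be illustrated with a small sketch. It is not the BSP's actual addressing logic: it assumes a simplified mapping in which an address falls in bank `address mod 17`, and it assumes 16 array elements are fetched per access (the lane count is not stated in the abstract). The function names are hypothetical. Under these assumptions, a strided access is conflict-free unless the stride is a multiple of 17, whereas a 16-bank memory conflicts on every even stride.

```python
NUM_BANKS = 17  # from the abstract: seventeen memory units
NUM_LANES = 16  # assumption: elements fetched in parallel per access


def banks_touched(base, stride, lanes=NUM_LANES, banks=NUM_BANKS):
    """Return the bank index hit by each of `lanes` strided addresses.

    Assumes the simplified mapping bank = address mod banks.
    """
    return [(base + i * stride) % banks for i in range(lanes)]


def is_conflict_free(base, stride, lanes=NUM_LANES, banks=NUM_BANKS):
    """True if no two of the parallel accesses land in the same bank."""
    hit = banks_touched(base, stride, lanes, banks)
    return len(set(hit)) == len(hit)


def conflicting_strides(max_stride, banks):
    """List the strides in 1..max_stride that cause a bank conflict."""
    return [
        s
        for s in range(1, max_stride + 1)
        if not is_conflict_free(0, s, banks=banks)
    ]


if __name__ == "__main__":
    # With 17 banks only multiples of 17 collide.
    print("17 banks:", conflicting_strides(64, banks=17))
    # With 16 banks every even stride collides.
    print("16 banks:", conflicting_strides(8, banks=16))
```

Rows, columns, and diagonals of a matrix correspond to different strides, so a bank count that is coprime with nearly every stride is what makes "most indexed arrays" accessible without conflict in this simplified model.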