Memory requirements for balanced computer architectures

Authors: H. T. Kung
Affiliations: Department of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania
Venue: ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Year: 1986

Abstract

One particular result is that to balance an array of p linearly connected PEs for performing matrix computations such as matrix multiplication and matrix triangularization, the size of each PE's local memory must grow linearly with p. Thus, the larger the array is, the larger each PE's local memory must be.
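
The linear growth can be motivated by a short calculation. The sketch below rests on two assumptions that are not stated in the abstract above: that matrix multiplication with M words of local memory can perform on the order of sqrt(M) operations per word of I/O, and that the array's external I/O bandwidth stays fixed while its aggregate compute rate scales with p. The function name and the `base_memory` parameter are hypothetical, and constant factors are ignored.

```python
def balanced_memory_per_pe(p: int, base_memory: float = 1.0) -> float:
    """Estimate the local memory each PE needs to keep a p-PE linear array balanced.

    Assumptions (illustrative, not taken from the abstract):
      * operations per word of I/O scale like sqrt(total memory);
      * external I/O bandwidth is fixed, compute rate grows p-fold;
      * base_memory is the memory that balances a single PE.
    """
    if p < 1:
        raise ValueError("p must be a positive integer")
    # Compute rate is p times larger, so operations per I/O word must rise p-fold.
    # With a sqrt(M) ratio, total memory must therefore rise by a factor of p**2.
    total_memory = base_memory * p ** 2
    # Spread evenly over p PEs, each PE holds a share that is linear in p.
    return total_memory / p


if __name__ == "__main__":
    for p in (1, 2, 4, 8, 16):
        print(p, balanced_memory_per_pe(p))
```

Under these assumptions, doubling the number of PEs doubles the memory each PE needs, which is the linear growth the abstract describes.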