Architectural differences of efficient sequential and parallel computers

Authors: Martti J. Forsell
Affiliations: VTT Electronics, PB 1100, FIN-90571 Oulu, Finland
Venue: Journal of Systems Architecture: the EUROMICRO Journal
Year: 2002

Abstract

In this paper we investigate which kind of computer architecture is efficient for executing sequential problems and which kind is efficient for executing parallel problems, from the processor architect's point of view. For that purpose we analytically evaluate the performance of eight general-purpose processor architectures, representing a wide range of both commercial and scientific processor designs, in both single-processor and multiprocessor setups. According to the evaluation, the most efficient architecture for sequential problems is a two-level pipelined VLIW (very long instruction word) architecture with few parallel functional units, whereas the most efficient architecture for parallel problems is a deeply inter-thread superpipelined architecture in which functional units are chained. Thus, designing a computer for efficient sequential computation leads to a very different architecture than designing one for efficient parallel computation, and no single architecture is optimal for general-purpose computation.