Communicating sequential processes
Communicating sequential processes
on Parallel MIMD computation: HEP supercomputer and its applications
on Parallel MIMD computation: HEP supercomputer and its applications
Optimal pipelining in supercomputers
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Instruction issue logic for high-performance, interruptable pipelined processors
ISCA '87 Proceedings of the 14th annual international symposium on Computer architecture
Characterization of branch and data dependencies on programs for evaluating pipeline performance
IEEE Transactions on Computers
Measuring Parallelism in Computation-Intensive Scientific/Engineering Applications
IEEE Transactions on Computers
The design and analysis of parallel algorithms
The design and analysis of parallel algorithms
Available instruction-level parallelism for superscalar and superpipelined machines
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Limits on multiple instruction issue
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Super-scalar processor design
Computer architecture: a quantitative approach
Computer architecture: a quantitative approach
A bridging model for parallel computation
Communications of the ACM
Journal of Computer and System Sciences
IMPACT: an architectural framework for multiple-instruction-issue processors
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
An introduction to parallel algorithms
An introduction to parallel algorithms
Limits of control flow on parallelism
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Dynamic dependency analysis of ordinary programs
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
LogP: towards a realistic model of parallel computation
PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
Are multiport memories physically feasible?
ACM SIGARCH Computer Architecture News - Special issue on input/output in parallel computer systems
Evaluating Performance Tradeoffs Between Fine-Grained and Coarse-Grained Alternatives
IEEE Transactions on Parallel and Distributed Systems
Microprocessor Architectures: From VLIW to Tta
Microprocessor Architectures: From VLIW to Tta
Multithreaded Processor Design
Multithreaded Processor Design
Practical Pram Programming
Speculative Multithreaded Processors
Computer
Very Long Instruction Word architectures and the ELI-512
ISCA '83 Proceedings of the 10th annual international symposium on Computer architecture
Parallelism in random access machines
STOC '78 Proceedings of the tenth annual ACM symposium on Theory of computing
A parallel computer as a NOC region
Networks on chip
Superpipelined high-performance optical-flow computation architecture
Computer Vision and Image Understanding
Configurable emulated shared memory architecture for general purpose MP-SOCs and NOC regions
NOCS '09 Proceedings of the 2009 3rd ACM/IEEE International Symposium on Networks-on-Chip
Reducing the associativity and size of step caches in CRCW operation
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Hi-index | 0.00 |
In this paper we try to conclude what kind of a computer architecture is efficient for executing sequential problems, and what kind of an architecture is efficient for executing parallel problems from the processor architect's point of view. For that purpose we analytically evaluate the performance of eight general purpose processor architectures representing widely both commercial and scientific processor designs in both single processor and multiprocessor setups. The results are interesting. The most efficient architecture for sequential problems is a two-level pipelined VLIW (very long instruction word) architecture with few parallel functional units. The most efficient architecture for parallel problems is a deeply inter-thread superpipelined architecture in which functional units are chained. Thus, designing a computer for efficient sequential computation leads to a very different architecture than designing one for efficient parallel computation and there exists no single optimal architecture for general purpose computation.