Design principles for a virtual multiprocessor

Authors:
Philip Machanick
Affiliations:
University of Queensland, Qld, Australia
Venue:
Proceedings of the 2007 annual research conference of the South African institute of computer scientists and information technologists on IT research in developing countries
Year:
2007

Abstract

The case for chip multiprocessor (CMP) or multicore designs is strong and increasingly accepted, as evidenced by the growing number of commercial multicore designs. However, there is also some evidence that the quest for instruction-level parallelism (ILP), like the Monty Python parrot, is not dead but resting. The cases for CMP and ILP are complementary. A multitasking or multithreaded workload will do better on a CMP design; a floating-point application without many decision points will do better on a machine that relies mainly on ILP for its parallelism. This paper explores a model for achieving both in the same design by reconfiguring functional units on the fly. The result is a virtual multiprocessor (vMP), which at the software level looks like either a uniprocessor with n clusters of functional units or an n-core CMP, depending on how the data path is configured.