The Manchester prototype dataflow computer
Communications of the ACM - Special section on computer architecture
Multithreading: a revisionist view of dataflow architectures
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
TAM—a compiler controlled threaded abstract machine
Journal of Parallel and Distributed Computing - Special issue on dataflow and multithreaded architectures
The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Simultaneous multithreading: maximizing on-chip parallelism
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
A design study of the EARTH multiprocessor
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
The case for a single-chip multiprocessor
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Monsoon: an explicit token-store architecture
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Wattch: a framework for architectural-level power analysis and optimizations
Proceedings of the 27th annual international symposium on Computer architecture
Scheduled Dataflow: Execution Paradigm, Architecture, and Performance Evaluation
IEEE Transactions on Computers - Special issue on the parallel architecture and compilation techniques conference
Area and System Clock Effects on SMT/CMP Processors
Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques
Thread Synchronization Unit (TSU): A Building Block for High Performance Computers
ISHPC '97 Proceedings of the International Symposium on High Performance Computing
Two Fundamental Limits on Dataflow Multiprocessing
PACT '93 Proceedings of the IFIP WG10.3. Working Conference on Architectures and Compilation Techniques for Fine and Medium Grain Parallelism
Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture
Proceedings of the 30th annual international symposium on Computer architecture
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Designing real-time H.264 decoders with dataflow architectures
CODES+ISSS '05 Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Proceedings of the 33rd annual international symposium on Computer Architecture
Area-Performance Trade-offs in Tiled Dataflow Architectures
Proceedings of the 33rd annual international symposium on Computer Architecture
Data-Driven Multithreading Using Conventional Microprocessors
IEEE Transactions on Parallel and Distributed Systems
A case for chip multiprocessors based on the data-driven multithreading model
International Journal of Parallel Programming
Computer
Computer
Communication assist for data driven multithreading
PCI'01 Proceedings of the 8th Panhellenic conference on Informatics
DDM-CMP: data-driven multithreading on a chip multiprocessor
SAMOS'05 Proceedings of the 5th international conference on Embedded Computer Systems: architectures, Modeling, and Simulation
The shared-thread multiprocessor
Proceedings of the 22nd annual international conference on Supercomputing
Hardware support for fine-grained event-driven computation in Anton 2
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
Hi-index | 0.00 |
Although the dataflow model of execution, with its obvious benefits, has been proposed for a long time, it has not yet been successfully exploited. Nevertheless, as traditional systems have recently started to reach their limits in delivering higher performance, new models of execution that use dataflow-like concepts are being studied. Among these, Data-Driven Multithreading (DDM) is a multithreading model that effectively hides the communication delay and synchronisation overheads. In DDM threads are scheduled as soon as their input data has been produced, that is, in a dataflow-like way. In addition to presenting a motivation to the dataflow model of execution, this paper also presents an overview of the DDM project. In particular, it focuses on the Chip Multiprocessor (CMP) implementation using the DDM model, its hardware, run-time system and performance evaluation. The DDM-CMP inherits the benefits of both the DDM model which allows to overcome the memory wall limitation and the CMP which offers a simpler design, higher degree of parallelism and larger power-performance efficiency, therefore overcoming the power wall. Preliminary experimental results show a significant benefit in terms of both speedup and power consumption, making the DDM-CMP architecture an attractive architecture for future processors.