Latency tolerance is one of the main concerns in parallel processing. Data Driven Multithreading, a technique that uses extra hardware to schedule threads for execution based on data availability, improves performance by tolerating latency. With Data Driven Multithreading, a thread is scheduled for execution only if all of its inputs have been produced and placed in the processor's local memory. Communication and synchronization are decoupled from the computation portions of a program, i.e., they execute asynchronously. Thus, an executing thread experiences no synchronization or communication latencies. The processor can, however, be idle when no threads are ready for execution. Communication latencies are therefore difficult to hide completely in applications with a high communication-to-computation ratio. This paper presents three mechanisms for implementing the communication assist of a Data Driven Multithreaded architecture. The first mechanism relies only on fine-grain communication, where each packet transfers a single value. With the second mechanism, the communication assist is modified to support block data communication through the same fine-grain interconnection network as the first configuration. The third mechanism employs a broadcast network such as Ethernet to transfer blocks of data, while fine-grain communication is handled the same way as in the other two mechanisms.
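The scheduling rule described above (a thread runs only once all of its inputs have arrived in local memory) can be illustrated with a small software sketch. This is not the paper's hardware design: the names `ThreadTemplate`, `Scheduler`, `deliver`, and `step`, and the use of a per-thread ready count, are assumptions made purely for illustration. Each call to `deliver` stands in for one fine-grain packet carrying a single value, and the sketch assumes each input slot is delivered exactly once.

```python
from collections import deque


class ThreadTemplate:
    """Hypothetical thread descriptor: a body plus a count of missing inputs."""

    def __init__(self, name, n_inputs, body):
        self.name = name
        self.ready_count = n_inputs  # inputs not yet delivered
        self.inputs = {}             # stands in for the processor's local memory
        self.body = body


class Scheduler:
    """Illustrative stand-in for the data-driven thread scheduling logic."""

    def __init__(self):
        self.threads = {}
        self.ready = deque()   # threads whose inputs have all arrived
        self.idle_polls = 0    # times the processor found nothing to run

    def add(self, thread):
        self.threads[thread.name] = thread
        if thread.ready_count == 0:
            self.ready.append(thread)

    def deliver(self, name, slot, value):
        # One fine-grain packet: a single value for a single input slot.
        # Assumes each slot is delivered exactly once.
        t = self.threads[name]
        t.inputs[slot] = value
        t.ready_count -= 1
        if t.ready_count == 0:
            self.ready.append(t)

    def step(self):
        # Run one ready thread, or record an idle poll if none is ready.
        if not self.ready:
            self.idle_polls += 1
            return None
        t = self.ready.popleft()
        return t.body(self, **t.inputs)


def run_demo():
    sched = Scheduler()
    log = []

    def sum_body(s, a, b):
        log.append(("sum", a + b))
        s.deliver("scale", "x", a + b)  # producer forwards its result

    def scale_body(s, x, k):
        log.append(("scale", x * k))

    sched.add(ThreadTemplate("sum", 2, sum_body))
    sched.add(ThreadTemplate("scale", 2, scale_body))

    sched.step()                     # idle: no inputs have arrived
    sched.deliver("sum", "a", 3)
    sched.deliver("scale", "k", 10)
    sched.step()                     # idle: "sum" still waits for b
    sched.deliver("sum", "b", 4)
    sched.step()                     # runs "sum", which enables "scale"
    sched.step()                     # runs "scale"
    return log, sched.idle_polls
```

Calling `run_demo()` returns the execution log together with the number of idle polls. The two idle polls correspond to the case noted in the abstract, where the processor has no ready thread because inputs are still in flight.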