An Efficient Way of Passing of Data in a Multithreaded Scheduled Dataflow Architecture

Authors:
Joseph M. Arul;Tsozen Yeh;Chiacheng Hsu;Janjr Li
Affiliations:
Fu Jen Catholic University, Hsin Chuang, Taipei, Taiwan (all four authors)
Venue:
HPCASIA '05 Proceedings of the Eighth International Conference on High-Performance Computing in Asia-Pacific Region
Year:
2005

Abstract

The Scheduled Dataflow (SDF) architecture deviates from the current trend of building complex hardware to exploit Instruction Level Parallelism (ILP) by exploring a simpler, yet powerful, execution paradigm based on dataflow, multithreading, and decoupling of memory accesses from execution. A program is partitioned into non-blocking threads, and all memory accesses are decoupled from the thread's execution. Data is pre-loaded into the thread's context (registers), and all results are post-stored after the thread completes execution. This paper presents an efficient way of storing data directly into the thread's register context, as opposed to storing it in frame memory. This approach eliminates the need to create thread frames when sufficient register contexts are available in the system. It thus becomes possible to explore how the SDF architecture's performance scales as more register contexts are made available on the chip. All benchmarks run using the two methods show a performance improvement of at least about 20%. This method of passing data to a consecutive thread in a multithreaded architecture could be applied generally.
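The difference between the two ways of passing data described in the abstract can be illustrated with a toy model. The sketch below is not the SDF instruction set or the authors' simulator: the class `ToyMachine`, its counters, and the methods `pass_via_frame` and `pass_direct` are hypothetical names introduced only for illustration. It assumes that the frame-based path costs one frame allocation plus one store and one load per value, and that the direct path costs one register write per value when a free register context exists, falling back to a frame otherwise.

```python
from dataclasses import dataclass


@dataclass
class ToyMachine:
    """Hypothetical cost model; not the real SDF hardware or simulator."""
    free_contexts: int        # register contexts not yet assigned to a thread
    frame_allocs: int = 0     # thread frames created in frame memory
    frame_stores: int = 0     # producer stores into frame memory
    frame_loads: int = 0      # consumer pre-loads from frame memory
    register_writes: int = 0  # direct writes into a consumer's register context

    def pass_via_frame(self, values):
        """Producer stores into a frame; consumer pre-loads into registers."""
        self.frame_allocs += 1
        frame = list(values)            # post-store by the producer thread
        self.frame_stores += len(frame)
        registers = list(frame)         # pre-load by the consumer thread
        self.frame_loads += len(registers)
        return registers

    def pass_direct(self, values):
        """Write straight into a free register context, if one is available."""
        if self.free_contexts == 0:
            # Assumed fallback: no context is free, so use a frame instead.
            return self.pass_via_frame(values)
        self.free_contexts -= 1
        registers = list(values)
        self.register_writes += len(registers)
        return registers


machine = ToyMachine(free_contexts=1)
first = machine.pass_direct([1, 2, 3])   # uses the only free register context
second = machine.pass_direct([4, 5])     # no context left, falls back to a frame
```

In this model the first transfer touches no frame memory at all, while the second pays for a frame allocation and two memory accesses per value. The counts only illustrate where the savings would come from; they say nothing about the 20% figure reported for the real benchmarks.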