MULTILISP: a language for concurrent symbolic computation
ACM Transactions on Programming Languages and Systems (TOPLAS)
Incorporating data flow ideas into von neumann processors for parallel execution
IEEE Transactions on Computers
Resource requirements of dataflow programs
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
MASA: a multithreaded processor architecture for parallel symbolic computing
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
Two fundamental issues in multiprocessing
4th International DFVLR Seminar on Foundations of Engineering Sciences on Parallel Computing in Science and Engineering
Reduced instruction set computers
Communications of the ACM - Special section on computer architecture
Partitioning parallel programs for macro-dataflow
LFP '86 Proceedings of the 1986 ACM conference on LISP and functional programming
Interprocess communication and processor dispatching on the Intel 432
ACM Transactions on Computer Systems (TOCS)
Communications of the ACM - Special issue on computer architecture
Very Long Instruction Word architectures and the ELI-512
ISCA '83 Proceedings of the 10th annual international symposium on Computer architecture
Performance measurements on HEP - a pipelined MIMD computer
ISCA '83 Proceedings of the 10th annual international symposium on Computer architecture
A critique of multiprocessing von Neumann style
ISCA '83 Proceedings of the 10th annual international symposium on Computer architecture
A critique of multiprocessing von Neumann style
ISCA '83 Proceedings of the 10th annual international symposium on Computer architecture
ASPLOS I Proceedings of the first international symposium on Architectural support for programming languages and operating systems
Architectural support for the efficient generation of code for horizontal architectures
ASPLOS I Proceedings of the first international symposium on Architectural support for programming languages and operating systems
RESOURCE MANAGEMENT FOR THE TAGGED TOKEN DATAFLOW ARCHITECTURE
RESOURCE MANAGEMENT FOR THE TAGGED TOKEN DATAFLOW ARCHITECTURE
A COMPILER FOR THE MIT TAGGED-TOKEN DATAFLOW ARCHITECTURE
A COMPILER FOR THE MIT TAGGED-TOKEN DATAFLOW ARCHITECTURE
Can dataflow subsume von Neumann computing?
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Analysis of multithreaded architectures for parallel computing
SPAA '90 Proceedings of the second annual ACM symposium on Parallel algorithms and architectures
TWIST-TOP: transputers with I-stores test out processor
CSC '90 Proceedings of the 1990 ACM annual conference on Cooperation
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Comparative evaluation of latency reducing and tolerating techniques
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Multithreading: a revisionist view of dataflow architectures
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Distributed Instruction Set Computer Architecture
IEEE Transactions on Computers
Executing DSP Applications in a Fine-Grained Dataflow Environment
IEEE Transactions on Software Engineering
SPIRE: streaming processing with instructions release element
ACM SIGARCH Computer Architecture News
Hiding memory latency using dynamic scheduling in shared-memory multiprocessors
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
The expandable split window paradigm for exploiting fine-grain parallelsim
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
An elementary processor architecture with simultaneous instruction issuing from multiple threads
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Thread-based programming for the EM-4 hybrid dataflow machine
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
T: a multithreaded massively parallel architecture
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Processor coupling: integrating compile time and runtime scheduling for parallelism
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Cache Memories for Data Flow Machines
IEEE Transactions on Computers
Microarchitecture support for dynamic scheduling of acyclic task graphs
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
An efficient implementation scheme of concurrent object-oriented languages on stock multicomputers
PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
Register relocation: flexible contexts for multithreading
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Generation and quantitative evaluation of dataflow clusters
FPCA '93 Proceedings of the conference on Functional programming languages and computer architecture
Data stream control optimization in dataflow architectures
ICS '93 Proceedings of the 7th international conference on Supercomputing
Empirical study of latency hiding on a fine-grain parallel processor
ICS '93 Proceedings of the 7th international conference on Supercomputing
Space-efficient scheduling of multithreaded computations
STOC '93 Proceedings of the twenty-fifth annual ACM symposium on Theory of computing
A model for dataflow based vector execution
ICS '94 Proceedings of the 8th international conference on Supercomputing
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Design of cache memories for multi-threaded dataflow architecture
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
Analysis of communications and overhead reduction in multithreaded execution
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
A basic architecture supporting LGDG computation
ICS '90 Proceedings of the 4th international conference on Supercomputing
An evaluation of bottom-up and top-down thread generation techniques
MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
Thread partitioning and scheduling based on cost model
Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
Retrospective: a preliminary architecture for a basic data flow processor
25 years of the international symposia on Computer architecture (selected papers)
Retrospective: multiscalar processors
25 years of the international symposia on Computer architecture (selected papers)
APRIL: a processor architecture for multiprocessing
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Asynchrony in parallel computing: from dataflow to multithreading
Progress in computer research
Scheduled Dataflow: Execution Paradigm, Architecture, and Performance Evaluation
IEEE Transactions on Computers - Special issue on the parallel architecture and compilation techniques conference
Asynchrony in parallel computing: from dataflow to multithreading
Progress in computer research
Speculative Multithreaded Processors
Computer
Performance Tradeoffs in Multithreaded Processors
IEEE Transactions on Parallel and Distributed Systems
Amir Roth: Speculative Multithreaded Processors
HiPC '00 Proceedings of the 7th International Conference on High Performance Computing
Asynchronous Resource Management
IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
An Evaluation of Optimized Threaded Code Generation
PACT '94 Proceedings of the IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques
The Initial Performance of a Bottom-Up Clustering Algorithm for Dataflow Graphs
PACT '93 Proceedings of the IFIP WG10.3. Working Conference on Architectures and Compilation Techniques for Fine and Medium Grain Parallelism
Two Fundamental Limits on Dataflow Multiprocessing
PACT '93 Proceedings of the IFIP WG10.3. Working Conference on Architectures and Compilation Techniques for Fine and Medium Grain Parallelism
The Named-State Register File: Implementation and Performance
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Design and performance evaluation of a multithreaded architecture
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Advances in dataflow programming languages
ACM Computing Surveys (CSUR)
Analysis and Modeling of Advanced PIM Architecture Design Tradeoffs
Proceedings of the 2004 ACM/IEEE conference on Supercomputing
CODACS Prototype: A Platform-Processor for CHIARA Programs
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 13 - Volume 14
Proceedings of the 33rd annual international symposium on Computer Architecture
Distributed Microarchitectural Protocols in the TRIPS Prototype Processor
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Multithreaded architecture for multimedia processing
Integrated Computer-Aided Engineering
Enhancing Microkernel Performance on VLIW DSP Processors via Multiset Context Switch
Journal of Signal Processing Systems
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Decomposition of Task-Level Concurrency on C Programs Applied to the Design of Multiprocessor SoC
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
An expressive language and efficient execution system for software agents
Journal of Artificial Intelligence Research
Task superscalar: using processors as functional units
HotPar'10 Proceedings of the 2nd USENIX conference on Hot topics in parallelism
Hi-index | 0.01 |
Dataflow architectures offer the ability to trade program level parallelism in order to overcome machine level latency. Dataflow further offers a uniform synchronization paradigm, representing one end of a spectrum wherein the unit of scheduling is a single instruction. At the opposite extreme are the von Neumann architectures which schedule on a task, or process, basis.This paper examines the spectrum by proposing a new architecture which is a hybrid of dataflow and von Neumann organizations. The analysis attempts to discover those features of the dataflow architecture, lacking in a von Neumann machine, which are essential for tolerating latency and synchronization costs. These features are captured in the concept of a parallel machine language which can be grafted on top of an otherwise traditional von Neumann base. In such an architecture, the units of scheduling, called scheduling quanta, are bound at compile time rather than at instruction set design time. The parallel machine language supports this notion via a large synchronization name space.A prototypical architecture is described, and results of simulation studies are presented. A comparison is made between the MIT Tagged-Token Dataflow machine and the subject machine which presents a model for understanding the cost of synchronization in a parallel environment.