What should the architecture of each node in a general-purpose, massively parallel architecture (MPA) be? We frame the question in concrete terms by describing two fundamental problems that any general-purpose MPA must solve well. From these, we systematically develop the required logical organization of an MPA node and present some details of *T (pronounced "Start"), a concrete architecture designed to these requirements. *T is a direct descendant of dynamic dataflow architectures and unifies them with von Neumann architectures. We discuss a hand-compiled example and some compilation issues.