The Manchester prototype dataflow computer
Communications of the ACM - Special section on computer architecture
Cache coherence protocols: evaluation using a multiprocessor simulation model
ACM Transactions on Computer Systems (TOCS)
A class of compatible cache consistency protocols and their support by the IEEE futurebus
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Toward a dataflow/von Neumann hybrid architecture
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
An architecture of a dataflow single chip processor
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Sequential description and parallel execution language DFCII dataflow supercomputers
ICS '91 Proceedings of the 5th international conference on Supercomputing
Comparative evaluation of latency reducing and tolerating techniques
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Improved multithreading techniques for hiding communication latency in multiprocessors
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
A universal parallel computer architecture
Selected papers of international conference on Fifth generation computer systems 92
A critique of multiprocessing von Neumann style
ISCA '83 Proceedings of the 10th annual international symposium on Computer architecture
A critique of multiprocessing von Neumann style
ISCA '83 Proceedings of the 10th annual international symposium on Computer architecture
Hi-index | 0.00 |
Latency associated with memory accesses and process communications are one of the most difficult obstacles in constructing a practical massively parallel system. So far, two approaches to hide latencies have been proposed. They are prefetching and multi-threading. An instruction-level data-driven computer is an ideal test-bed for evaluating these latency hiding methods because prefetching and multi-threading are naturally implemented in an instruction-level data-driven computer as unfolding and concurrent execution of multiple contexts. This paper evaluates latency hiding methods on SIGMA-1, a dataflow supercomputer developed in Electrotechnical Laboratory. As a result of evaluation, these methods are effective to hide static latencies but not effective to hide dynamic latencies. Also, concurrent execution of multiple contexts is more effective than prefetching.