ACM Computing Surveys (CSUR)
Digital signal processing (3rd ed.): principles, algorithms, and applications
Digital signal processing (3rd ed.): principles, algorithms, and applications
A preliminary architecture for a basic data-flow processor
ISCA '75 Proceedings of the 2nd annual symposium on Computer architecture
Computer Architecture: A Quantitative Approach
Computer Architecture: A Quantitative Approach
Validity of the single processor approach to achieving large scale computing capabilities
AFIPS '67 (Spring) Proceedings of the April 18-20, 1967, spring joint computer conference
Principles of Asynchronous Circuit Design: A Systems Perspective
Principles of Asynchronous Circuit Design: A Systems Perspective
The design of a dataflow coprocessor for low power embedded hierarchical processing
PATMOS'06 Proceedings of the 16th international conference on Integrated Circuit and System Design: power and Timing Modeling, Optimization and Simulation
Hi-index | 0.00 |
With the rapid development of silicon technology, chip die size and clock frequency increase; it becomes very difficult to further increase the performance of silicon devices only by speeding up their clock. We believe a more practical way to increase their speed is to use the abundant transistor resource to implement several cores and make the cores execute in parallel. In the paper, we propose a processor-coprocessor architecture to increase the speed of the most frequently-used short program segments and reduce their power consumption. As these segments dominate the dynamic execution trace of embedded programs, the overall increment of performance and power-saving is significant. A dataflow coprocessor and a RISC coprocessor are implemented for comparison. The experimental results show the dataflow coprocessor is faster and more power-efficient than the other one, due to the fact that the dataflow coprocessor offers natural properties for fine-grained instruction-level parallel processing and its parallelism is self-scheduling. Except for data dependencies, there is no constrained sequentiality, so a dataflow program allows all forms of instruction parallelism.