In this paper, we introduce a new approach to design space exploration (DSE) for non-clustered VLIW architectures. It differs from existing techniques in its “bottom-up” strategy: whereas other approaches first design an architecture and then build a possible schedule for it, we first build a schedule and then synthesize an architecture that is suitable for executing that schedule. As a result, solutions are obtained fully automatically and in very short time. Furthermore, arbitrary types of functional units can be explored without significantly increasing the exploration time. We evaluated our method and compared the results to an existing DSE approach for clustered and non-clustered architectures. For non-clustered architectures, we almost always obtain better results. In many cases the number of register file ports is reduced, which in turn leads to higher clock rates. For some examples, our non-clustered architecture is better than the best clustered one.
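
The bottom-up direction (schedule first, architecture second) can be illustrated with a minimal sketch. This is not the paper's algorithm. It assumes a schedule is already given, that every operation takes one cycle, and that an operation reads all its operands and writes its result in the cycle it is issued. The schedule data, the functional-unit names, and the function `derive_architecture` are hypothetical and exist only for illustration. Under these assumptions, the number of functional units of each type and the number of register file ports are simply the peak per-cycle demand of the schedule.

```python
from collections import Counter

# Hypothetical schedule: one list per cycle, each entry is
# (functional-unit type, register reads, register writes).
schedule = [
    [("alu", 2, 1), ("alu", 2, 1), ("mul", 2, 1)],
    [("alu", 2, 1), ("mem", 1, 1)],
    [("mul", 2, 1), ("mem", 2, 0)],
]

def derive_architecture(schedule):
    """Derive a minimal resource set able to execute the given schedule.

    Assumption: single-cycle operations, so the resources needed are the
    maximum demand observed in any single cycle.
    """
    fu_count = Counter()
    read_ports = 0
    write_ports = 0
    for cycle in schedule:
        # Units of each type that are busy in this cycle.
        busy = Counter(fu for fu, _, _ in cycle)
        for fu, n in busy.items():
            fu_count[fu] = max(fu_count[fu], n)
        # Register file port demand in this cycle.
        read_ports = max(read_ports, sum(r for _, r, _ in cycle))
        write_ports = max(write_ports, sum(w for _, _, w in cycle))
    return dict(fu_count), read_ports, write_ports

fus, reads, writes = derive_architecture(schedule)
print(fus, reads, writes)
```

In this toy setting, a schedule that spreads operations more evenly across cycles lowers the peak port demand, which is the kind of effect the abstract links to higher clock rates.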