Program performance can be improved by exploiting both horizontal and vertical parallelism. In some processors (VLIW-based), the compiler can exploit horizontal parallelism by controlling the schedule of independent operations. Vertical parallelism is exploited through pipelining. However, in all processors, the pipeline structure is fixed and the compiler has no control over it. In Application-Specific Instruction-set Processors (ASIPs), the pipeline structure can be customized and used by the program through custom instructions. Practical constraints on the instruction decoder limit the number and complexity of custom instructions in ASIPs. Detecting frequent, beneficial custom instructions and incorporating them into the compiler are complex and sometimes very time-consuming tasks. In this paper, we present an architecture that does not limit the number of custom functionalities that can be implemented on its datapath. Instead of using custom instructions and relying on a hardware decoder to generate the control signals, we generate the control signal values in the compiler. Since this architecture has no predefined instructions, we call it a No-Instruction-Set Computer (NISC). The NISC compiler maps the application directly onto the datapath. It has complete fine-grained control over the datapath and can therefore make good use of the hardware resources as well as the horizontal and vertical parallelism in the program. We also explain the algorithm for mapping the control/data flow graph (CDFG) of a program onto a given datapath in NISC. Using our algorithm and a NISC architecture with the datapath of a MIPS, we achieved up to 70% speedup over the traditional MIPS compiler. In another experiment, we started from a base architecture and customized it by adding resources and interconnects to increase its horizontal and vertical parallelism. The algorithm achieved up to a 15.5-times speedup by exploiting the available parallelism in the program and the datapath.
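The abstract does not spell out the mapping algorithm, so the following is only a minimal sketch of the general idea of a compiler emitting one control word per cycle without any instruction set in between. It substitutes a plain greedy list scheduler for the paper's algorithm. The names `Op` and `schedule`, the single-cycle latency of every operation, and the unit labels such as `mul0` are all assumptions made for illustration, and pipelining (vertical parallelism) is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """One node of a data flow graph (hypothetical, illustration only)."""
    name: str          # name of the value this op produces
    kind: str          # functional-unit type it needs, e.g. "add" or "mul"
    deps: tuple = ()   # names of ops whose results it consumes


def schedule(ops, units):
    """Greedy list scheduling onto a datapath described by `units`.

    ops   -- list of Op forming an acyclic graph
    units -- dict: unit kind -> number of units of that kind in the datapath
    Returns one control word per cycle, each a dict {unit_label: op_name}.
    Assumption: every operation completes in exactly one cycle.
    """
    done_at = {}            # op name -> cycle in which it was issued
    pending = list(ops)
    words = []
    cycle = 0
    while pending:
        free = dict(units)  # units still unused in this cycle
        word = {}
        for op in list(pending):
            # An op is ready once all its inputs were produced in earlier cycles.
            ready = all(d in done_at and done_at[d] < cycle for d in op.deps)
            if ready and free.get(op.kind, 0) > 0:
                free[op.kind] -= 1
                index = units[op.kind] - free[op.kind] - 1
                word[f"{op.kind}{index}"] = op.name
                done_at[op.name] = cycle
                pending.remove(op)
        if not word:
            raise ValueError("nothing schedulable: cyclic graph or missing unit")
        words.append(word)
        cycle += 1
    return words


# t3 = t1 + t2, t4 = t3 + (something), with t1 and t2 independent multiplies.
ops = [
    Op("t1", "mul"),
    Op("t2", "mul"),
    Op("t3", "add", ("t1", "t2")),
    Op("t4", "add", ("t3",)),
]

narrow = schedule(ops, {"mul": 1, "add": 1})  # base datapath
wide = schedule(ops, {"mul": 2, "add": 1})    # one extra multiplier added

for cycle, word in enumerate(wide):
    print(cycle, word)
```

In this toy setting, adding a second multiplier lets the two independent multiplies share a cycle, so the schedule shrinks from four control words to three. That mirrors, in a very reduced form, the abstract's experiment of adding resources to a base datapath to expose more horizontal parallelism.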