REMARC (abstract): reconfigurable multimedia array coprocessor
FPGA '98 Proceedings of the 1998 ACM/SIGDA sixth international symposium on Field programmable gate arrays
A DAG-based design approach for reconfigurable VLIW processors
DATE '99 Proceedings of the conference on Design, automation and test in Europe
IEEE Transactions on Computers
A decade of reconfigurable computing: a visionary retrospective
Proceedings of the conference on Design, automation and test in Europe
Instruction generation and regularity extraction for reconfigurable processors
CASES '02 Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems
Synthesis of custom processors based on extensible platforms
Proceedings of the 2002 IEEE/ACM international conference on Computer-aided design
Garp: a MIPS processor with a reconfigurable coprocessor
FCCM '97 Proceedings of the 5th IEEE Symposium on FPGA-Based Custom Computing Machines
SPARK: A High-Lev l Synthesis Framework For Applying Parallelizing Compiler Transformations
VLSID '03 Proceedings of the 16th International Conference on VLSI Design
Automatic generation of application specific processors
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
The chimaera reconfigurable functional unit
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Characterizing embedded applications for instruction-set extensible processors
Proceedings of the 41st annual Design Automation Conference
Proceedings of the 31st annual international symposium on Computer architecture
Automatic application-specific instruction-set extensions under microarchitectural constraints
International Journal of Parallel Programming - Special issue: Workshop on application specific processors (WASP)
The MOLEN Polymorphic Processor
IEEE Transactions on Computers
Dynamic Strands: Collapsing Speculative Dependence Chains for Reducing Pipeline Communication
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Dataflow Mini-Graphs: Amplifying Superscalar Capacity and Bandwidth
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Instruction set extension with shadow registers for configurable processors
Proceedings of the 2005 ACM/SIGDA 13th international symposium on Field-programmable gate arrays
Run-Time Reconfigurable Systems for Digital Signal Processing Applications: A Survey
Journal of VLSI Signal Processing Systems
An Architecture Framework for Transparent Instruction Set Customization in Embedded Processors
Proceedings of the 32nd annual international symposium on Computer Architecture
Exploiting pipelining to relax register-file port constraints of instruction-set extensions
Proceedings of the 2005 international conference on Compilers, architectures and synthesis for embedded systems
Automated Instruction-Set Extension of Embedded Processors with Application to MPEG-4 Video Encoding
ASAP '05 Proceedings of the 2005 IEEE International Conference on Application-Specific Systems, Architecture Processors
MiBench: A free, commercially representative embedded benchmark suite
WWC '01 Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop
The ArchC architecture description language and tools
International Journal of Parallel Programming
Exploiting forwarding to improve data bandwidth of instruction-set extensions
Proceedings of the 43rd annual Design Automation Conference
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
An Application Development Framework for ARISE Reconfigurable Processors
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Near-Optimal Microprocessor and Accelerators Codesign with Latency and Throughput Constraints
ACM Transactions on Architecture and Code Optimization (TACO)
Hi-index | 0.00 |
ARISE introduces a systematic approach for extending once an embedded processor to support thereafter the coupling of an arbitrary number of custom computing units (CCUs). A CCU can be a hardwired or a reconfigurable unit, which can be utilized following a tight and/or loose model of computation. By selecting the appropriate model of computation for each part of the application, the complete application space is considered for acceleration, resulting in significant performance improvements. Also, ARISE offers modularity and scalability and is not restricted by the opcode space and operands limitation problems that exist in such type of machines. To support these features we introduce a machine organization that allows the cooperation of a processor and a set of CCUs. To control the CCUs we extend once the instruction set of the processor with eight instructions. To efficiently incorporate these features to an embedded processor, we propose a micro-architecture implementation that minimizes the control and communication overhead between the processor and the CCUs. To evaluate our proposal, we extended a MIPS processor with the ARISE infrastructure and implemented it on a Xilinx field-programmable gate array (FPGA). Implementation results, demonstrate that the timing model of the processor is not affected. Also, we implemented a set of benchmarks on the ARISE evaluation machine. Performance results prove significant improvements and reduced communication overhead compared to a typical coprocessor approach.