A simple on-line bin-packing algorithm
Journal of the ACM (JACM)
Design and Implementation of the MorphoSys Reconfigurable ComputingProcessor
Journal of VLSI Signal Processing Systems - Special issue on VLSI on custom computing technology
A HW/SW partitioning algorithm for dynamically reconfigurable architectures
Proceedings of the conference on Design, automation and test in Europe
Hardware-software cosynthesis for run-time incrementally reconfigurable FPGAs
ASP-DAC '00 Proceedings of the 2000 Asia and South Pacific Design Automation Conference
Introduction to Algorithms
Design Methodology of a Low-Energy Reconfigurable Single-Chip DSP System
Journal of VLSI Signal Processing Systems
The Garp Architecture and C Compiler
Computer
Fast Template Placement for Reconfigurable Computing Systems
IEEE Design & Test
Configurable computing: the catalyst for high-performance architectures
ASAP '97 Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures and Processors
The Chimaera reconfigurable functional unit
FCCM '97 Proceedings of the 5th IEEE Symposium on FPGA-Based Custom Computing Machines
A Quantitative Analysis of Reconfigurable Coprocessors for Multimedia Applications
FCCM '98 Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines
Reconfigurable Instruction Set Processors: A Survey
RSP '00 Proceedings of the 11th IEEE International Workshop on Rapid System Prototyping (RSP 2000)
Fast Online Task Placement on FPGAs: Free Space Partitioning and 2D-Hashing
IPDPS '03 Proceedings of the 17th International Symposium on Parallel and Distributed Processing
A Flexible and Energy-Efficient Coarse-Grained Reconfigurable Architecture for Mobile Systems
The Journal of Supercomputing
Computer
Saving Power by Mapping Finite-State Machines into Embedded Memory Blocks in FPGAs
Proceedings of the conference on Design, automation and test in Europe - Volume 2
Design Methodology for a Tightly Coupled VLIW/Reconfigurable Matrix Architecture: A Case Study
Proceedings of the conference on Design, automation and test in Europe - Volume 2
Synchroscalar: A Multiple Clock Domain, Power-Aware, Tile-Based Embedded Processor
Proceedings of the 31st annual international symposium on Computer architecture
Efficient search space exploration for HW-SW partitioning
Proceedings of the 2nd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Joint Application Mapping/Interconnect Synthesis Techniques for Embedded Chip-Scale Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
NoC Synthesis Flow for Customized Domain Specific Multiprocessor Systems-on-Chip
IEEE Transactions on Parallel and Distributed Systems
Cost considerations in network on chip
Integration, the VLSI Journal - Special issue: Networks on chip and reconfigurable fabrics
Proceedings of the 42nd annual Design Automation Conference
Reconfigurable Computing: Accelerating Computation with Field-Programmable Gate Arrays
Reconfigurable Computing: Accelerating Computation with Field-Programmable Gate Arrays
Hierarchical graph: a new cost effective architecture for network on chip
EUC'05 Proceedings of the 2005 international conference on Embedded and Ubiquitous Computing
Computers and Electrical Engineering
A 1GHz, DDR2/3 SSTL driver with On-Die Termination, strength calibration, and slew rate control
Computers and Electrical Engineering
Computers and Electrical Engineering
Hi-index | 0.00 |
There are many design challenges in the hardware-software co-design approach for performance improvement of data-intensive streaming applications with a general-purpose microprocessor and a hardware accelerator. These design challenges are mainly to prevent hardware area fragmentation to increase resource utilization, to reduce hardware reconfiguration cost and to partition and schedule the tasks between the microprocessor and the hardware accelerator efficiently for performance improvement and power savings of the applications. In this paper a modular and block based hardware configuration architecture named memory-aware run-time reconfigurable embedded system (MARTRES) is proposed for efficient resource management and performance improvement of streaming applications. Subsequently we design a task placement algorithm named hierarchical best fit ascending (HBFA) algorithm to prove that MARTRES configuration architecture is very efficient in increased resource utilization and flexible in task mapping and power savings. The time complexity of HBFA algorithm is reduced to O(n) compared to traditional Best Fit (BF) algorithm's time complexity of O(n^2), when the quality of the placement solution by HBFA is better than that of BF algorithm. Finally we design an efficient task partitioning and scheduling algorithm named balanced partitioned and placement-aware partitioning and scheduling algorithm (BPASA). In BPASA we exploit the temporal parallelism in streaming applications to reduce reconfiguration cost of the hardware, while keeping in mind the required throughput of the output data. We balance the exploitation of spatial parallelism and temporal parallelism in streaming applications by considering the reconfiguration cost vs. the data transfer cost. The scheduler refers to the HBFA placement algorithm to check whether contiguous area on FPGA is available before scheduling the task for HW or for SW.