IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Reconfigurable hybrid processor systems provide a flexible platform for mapping data-parallel applications and offer considerable speedup over software implementations. However, reconfiguration overhead is a significant deterrent to mapping applications onto reconfigurable hardware. Partial runtime reconfiguration is one approach to reducing this overhead. In this paper, we present a methodology for mapping data-parallel tasks onto hardware that supports partial reconfiguration. The aim is to obtain the maximum possible speedup for a given reconfiguration time, bus speed, and computation speed. The proposed approach uses multiple identical but independent processing units in the reconfigurable hardware. Under nonzero reconfiguration overhead, we show that there is an upper limit on the number of processing units that can be employed, beyond which no further reduction in execution time is possible. We obtain solutions for the minimum processing time, the corresponding load distribution, and the schedule for data transfer. To demonstrate the applicability of the analysis, we present: 1) plots showing the variation of processing time with different parameters; 2) hardware simulations for two examples, a 1-D discrete wavelet transform and a finite impulse response filter, targeted to Xilinx field-programmable gate arrays (FPGAs); and 3) experimental results for a hardware prototype implemented on an FPGA board.
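The claim that nonzero reconfiguration overhead caps the useful number of processing units can be illustrated with a deliberately simplified toy model. This sketch is not the model derived in the paper. It assumes (hypothetically) that units are configured one after another through a single configuration port, so unit i becomes available at time i * reconfig_time; that bus transfer cost is folded into a single per-item cost; and that the load is split so that all units finish at the same instant. All function and parameter names are illustrative.

```python
def finish_time(n_units, load, cost_per_item, reconfig_time):
    """Common finish time T for n_units under the toy model, or None.

    Unit i (1-based) starts at i * reconfig_time and handles
    (T - i * reconfig_time) / cost_per_item items. Requiring the
    shares to sum to `load` gives
        T = load * cost_per_item / n + reconfig_time * (n + 1) / 2.
    Returns None when the last unit would receive no positive load.
    """
    t = (load * cost_per_item
         + reconfig_time * n_units * (n_units + 1) / 2) / n_units
    if t <= n_units * reconfig_time:
        return None  # last unit is configured too late to contribute
    return t


def load_distribution(n_units, load, cost_per_item, reconfig_time):
    """Per-unit loads that make every unit finish at the same time."""
    t = finish_time(n_units, load, cost_per_item, reconfig_time)
    if t is None:
        raise ValueError("n_units exceeds the useful limit in this model")
    return [(t - i * reconfig_time) / cost_per_item
            for i in range(1, n_units + 1)]


def best_unit_count(load, cost_per_item, reconfig_time, max_units):
    """Search 1..max_units for the count with the smallest finish time."""
    best_n, best_t = None, float("inf")
    for n in range(1, max_units + 1):
        t = finish_time(n, load, cost_per_item, reconfig_time)
        if t is None:
            break  # further units cannot receive positive load
        if t < best_t:
            best_n, best_t = n, t
    return best_n, best_t


if __name__ == "__main__":
    # Hypothetical numbers: 1000 items, 1 time unit per item,
    # 100 time units to configure each processing unit.
    print(best_unit_count(1000, 1.0, 100.0, max_units=16))
    print(load_distribution(4, 1000, 1.0, 100.0))
```

In this toy model, adding a unit shortens the finish time exactly when the new unit can still receive a positive share of the load, so the search stops at a finite unit count whenever the reconfiguration time is nonzero. With zero reconfiguration time, every additional unit helps and the search runs up to `max_units`.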