ARC '09 Proceedings of the 5th International Workshop on Reconfigurable Computing: Architectures, Tools and Applications
Reconfigurable fabrics are designed by tiling operators and memory banks. In a system-on-chip context, including multiple local memories is critical for algorithmic performance, because they give the configured compute processes concurrent access to data. This paper considers a practical case in which the internal fabric buses and connectivity give the architecture a shared-memory characteristic. The approach relies on static reconfigurability and high-level programming techniques to make automated, deterministic scheduling of memory accesses feasible. A complete flow has been developed, from the programming model down to the microcode that enables task synchronization on memory resources. Compile-time analysis observes the sequence of operations in the concurrent processes and synthesizes a controller program that supports the best schedule of operations for high throughput. The hardware target is a reconfigurable fabric designed at STMicroelectronics in 65 nm technology. This hardware/software solution is scalable and flexible, and it provides high throughput on shared memory.