Uniform nested loops are widely used in scientific and multidimensional digital signal processing applications. Because these applications handle large amounts of data, on-chip memory is needed to speed up data access and improve overall system performance. This study proposes a static data scheduling method, carrot-hole data scheduling, for multidimensional applications in order to control the data traffic between different levels of memory. Based on this data schedule, the optimal partitioning and scheduling are selected. Experiments show that this technique significantly reduces on-chip memory misses compared with traditional methods. The carrot-hole data scheduling method is proven to yield the fewest on-chip memory misses among linear scheduling and partitioning schemes.
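To make the role of partitioning concrete, the following minimal sketch counts on-chip misses for a two-dimensional uniform loop nest under two iteration orders. It is not the carrot-hole method, whose details are not given here. It only illustrates, under stated assumptions, why the choice of partition affects traffic between memory levels. The dependence pattern (each iteration reads its upper and left neighbors), the LRU replacement policy, and all names (`count_misses`, `row_major`, `column_strips`, `CAPACITY`, `WIDTH`) are hypothetical choices made for the illustration.

```python
from collections import OrderedDict


def count_misses(order, capacity):
    """Count read misses for a loop nest visited in `order`.

    Assumed model: iteration (i, j) reads (i-1, j) and (i, j-1), then
    writes (i, j). The on-chip memory holds `capacity` values and is
    managed with LRU replacement (an assumption, not the paper's policy).
    """
    on_chip = OrderedDict()
    misses = 0

    def touch(cell, is_read):
        nonlocal misses
        if cell in on_chip:
            on_chip.move_to_end(cell)  # refresh LRU position on a hit
            return
        if is_read:
            misses += 1  # value must be fetched from off-chip memory
        on_chip[cell] = True
        if len(on_chip) > capacity:
            on_chip.popitem(last=False)  # evict the least recently used value

    for i, j in order:
        if i > 0:
            touch((i - 1, j), True)
        if j > 0:
            touch((i, j - 1), True)
        touch((i, j), False)
    return misses


def row_major(n, m):
    """Unpartitioned order: sweep each full row before moving to the next."""
    return [(i, j) for i in range(n) for j in range(m)]


def column_strips(n, m, width):
    """Partitioned order: finish one vertical strip of `width` columns at a time."""
    return [
        (i, j)
        for j0 in range(0, m, width)
        for i in range(n)
        for j in range(j0, min(j0 + width, m))
    ]


N, M = 32, 32   # iteration space size (hypothetical)
CAPACITY = 16   # on-chip memory size in values (hypothetical)
WIDTH = 4       # strip width (hypothetical)

row_misses = count_misses(row_major(N, M), CAPACITY)
strip_misses = count_misses(column_strips(N, M, WIDTH), CAPACITY)
print("row-major misses:", row_misses)
print("strip-partitioned misses:", strip_misses)
```

Both orders respect the assumed dependences, so they compute the same result. They differ only in how long a value must stay on chip before it is reused. With narrow strips, the upper neighbor is reused while it is still on chip, and only the values crossing a strip boundary must be fetched again.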