This paper presents a compiler strategy for optimizing data accesses in regular, array-intensive applications running on embedded multiprocessor environments. Specifically, we propose an optimization algorithm that aims to reduce the extra off-chip memory accesses caused by interprocessor communication. It does so by increasing the application-wide reuse of data residing in the processors' scratch-pad memories. Results obtained with four array-intensive image processing applications indicate that exploiting interprocessor data sharing can significantly reduce the energy-delay product on a four-processor embedded system.
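To make the idea of interprocessor scratch-pad sharing concrete, the following is a minimal sketch of an access-count model. It is not the paper's compiler algorithm. The function name `count_off_chip`, the block-granularity traces, the round-robin replay, and the LRU replacement policy are all assumptions introduced for illustration. It only counts how many fetches must go off-chip when a miss may, or may not, be served from another processor's scratch-pad.

```python
from collections import OrderedDict


def count_off_chip(traces, spm_blocks, share):
    """Count off-chip fetches for per-processor block access traces.

    traces:     one list of block ids per processor (hypothetical input format),
                replayed round-robin, one access per processor per step.
    spm_blocks: capacity of each scratch-pad memory, in blocks.
    share:      if True, a local miss is served on-chip whenever another
                processor's scratch-pad currently holds the block.
    """
    # One scratch-pad per processor; OrderedDict gives simple LRU ordering
    # (the replacement policy is an assumption of this sketch).
    spms = [OrderedDict() for _ in traces]
    off_chip = 0
    longest = max((len(t) for t in traces), default=0)

    for step in range(longest):
        for p, trace in enumerate(traces):
            if step >= len(trace):
                continue
            block = trace[step]
            spm = spms[p]

            if block in spm:
                spm.move_to_end(block)  # local hit: refresh LRU position
                continue

            # Local miss: check the other scratch-pads only if sharing is on.
            remote_hit = share and any(
                block in other for q, other in enumerate(spms) if q != p
            )
            if not remote_hit:
                off_chip += 1  # must be fetched from off-chip memory

            # Bring the block into the local scratch-pad, evicting LRU if full.
            spm[block] = True
            if len(spm) > spm_blocks:
                spm.popitem(last=False)

    return off_chip


# Four processors whose working sets overlap with their neighbours',
# as in a row-partitioned image filter (illustrative data only).
traces = [[0, 1], [1, 2], [2, 3], [3, 4]]
print("off-chip fetches without sharing:", count_off_chip(traces, 2, share=False))
print("off-chip fetches with sharing:   ", count_off_chip(traces, 2, share=True))
```

In this toy model, blocks that a neighbouring processor has already loaded are counted as on-chip transfers when sharing is enabled, which is the kind of saving the abstract attributes to exploiting interprocessor data sharing. The model ignores the energy and latency of the on-chip transfers themselves.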