Data-dominated signal processing applications are typically described using large, multi-dimensional arrays and loop nests. The order in which array elements are produced and consumed in these loop nests has a large impact on the amount of memory required during execution. This matters because the size and complexity of the memory hierarchy are the dominating factors for power, performance, and chip size in these applications. This paper presents a number of guiding principles for ordering the dimensions in the loop nests. They enable the designer, or design tools, to find the optimal ordering of loop nest dimensions for individual data dependencies in the code. We prove the validity of the guiding principles when no dimensions have been fixed in advance. If some dimensions are already fixed at given nest levels, this is taken into account when fixing the remaining dimensions, and in most cases an optimal ordering is found for this situation as well. The guiding principles can be used in the early design phases to minimize the memory requirement through in-place mapping. We use real-life examples to show how they can be applied to reach a cost-optimized end product. The results show orders-of-magnitude improvements in memory requirement compared to using the declared array sizes, and penalties of similar magnitude for choosing a suboptimal loop ordering when in-place mapping is exploited.
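The effect of loop ordering on storage can be illustrated with a small simulation. The sketch below is not the paper's method; it is a toy model under stated assumptions: a two-dimensional rectangular loop nest, a single dependency in which the element written at iteration p is read exactly once at iteration p + distance, and a storage slot that can be reused in the same iteration in which its element is read (a simple form of in-place mapping). The function name `max_live_elements` and its parameters are hypothetical, introduced only for this illustration.

```python
from itertools import product


def max_live_elements(extents, order, distance):
    """Peak number of simultaneously live array elements (toy model).

    extents  : size of each loop dimension, e.g. (4, 100)
    order    : dimensions listed from outermost to innermost, e.g. (0, 1)
    distance : dependency vector; the element written at iteration p
               is assumed to be read once, at iteration p + distance
    """
    # Enumerate all iterations in execution order for the chosen nesting.
    ranges = [range(extents[d]) for d in order]
    time_of = {}
    for t, point in enumerate(product(*ranges)):
        iteration = [0] * len(extents)
        for pos, d in enumerate(order):
            iteration[d] = point[pos]
        time_of[tuple(iteration)] = t

    # An element is live from the time it is written until it is read.
    events = []
    for iteration, t_write in time_of.items():
        reader = tuple(a + b for a, b in zip(iteration, distance))
        if reader in time_of:  # elements never read need no storage here
            t_read = time_of[reader]
            assert t_read > t_write, "ordering violates the dependency"
            events.append((t_write, +1))
            events.append((t_read, -1))

    # At equal times the -1 sorts first: a slot freed by a read can be
    # reused by the write of the same iteration (assumed in-place reuse).
    events.sort()
    live = peak = 0
    for _, delta in events:
        live += delta
        peak = max(peak, live)
    return peak


extents = (4, 100)   # hypothetical loop bounds: i in 0..3, j in 0..99
distance = (1, 0)    # A[i][j] is consumed at iteration (i + 1, j)
print(max_live_elements(extents, (0, 1), distance))  # i outermost: 100
print(max_live_elements(extents, (1, 0), distance))  # j outermost: 1
```

In this toy case the declared array holds 400 elements, while the peak number of live elements is 100 with the dependency-carrying dimension outermost and 1 with it innermost, which mirrors the kind of gap between declared sizes, good orderings, and poor orderings that the abstract describes.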