Most embedded systems have a limited amount of memory. In contrast, the memory requirements of the code running on them, loops in particular, are significant. This paper addresses the problem of estimating the amount of memory needed for data transfers in embedded systems. We analyze the problem of estimating the region associated with a statement, that is, the set of elements referenced by the statement during the execution of the entire set of nested loops. A quantitative analysis of the number of elements referenced is presented: exact expressions are derived for uniformly generated references, and close upper and lower bounds are derived for non-uniformly generated references. In addition to presenting an algorithm that computes the total memory required, we discuss the effect of transformations on the lifetimes of array variables, i.e., the time between the first and last accesses to a given array location. A detailed analysis of the effect of unimodular transformations on data locality, including the calculation of the maximum window size, is presented. The term maximum window size is introduced, and quantitative expressions are derived to compute it. The smaller the maximum window size, the higher the data locality in the loop.
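To make the two quantities concrete, the following is a minimal brute-force sketch, not the closed-form expressions or the algorithm of the paper. It enumerates a small loop nest, counts the distinct array elements referenced, and measures a maximum window size, taken here (as an assumption) to be the largest number of elements whose first access has already happened and whose last access has not yet passed. The names `iteration_order`, `analyze`, `bounds`, `refs`, and `perm` are hypothetical, and loop interchange stands in for the general class of unimodular transformations.

```python
from itertools import product


def iteration_order(bounds, perm):
    """Yield iteration vectors (in original index order) for a rectangular
    loop nest, executing the loops in the order given by perm
    (outermost first). perm = (1, 0) is a loop interchange."""
    ranges = [range(bounds[p]) for p in perm]
    for point in product(*ranges):
        it = [0] * len(bounds)
        for pos, p in enumerate(perm):
            it[p] = point[pos]
        yield tuple(it)


def analyze(bounds, refs, perm):
    """Return (distinct_elements, max_window) for the given loop order.

    refs is a list of functions mapping an iteration vector to the array
    element (a tuple of subscripts) touched by one reference.
    Assumption: an element is 'in the window' from the iteration of its
    first access through the iteration of its last access, inclusive."""
    first, last = {}, {}
    t = 0
    for it in iteration_order(bounds, perm):
        for ref in refs:
            elem = ref(it)
            first.setdefault(elem, t)  # record first access only once
            last[elem] = t             # keep overwriting: ends as last access
        t += 1

    # Sweep over time: +1 when a lifetime starts, -1 just after it ends.
    delta = [0] * (t + 1)
    for elem in first:
        delta[first[elem]] += 1
        delta[last[elem] + 1] -= 1

    live = max_window = 0
    for d in delta[:t]:
        live += d
        max_window = max(max_window, live)
    return len(first), max_window


def ref_a(it):
    """Illustrative reference A[i + j] in a two-deep nest (i, j)."""
    return (it[0] + it[1],)


# for i in 0..3: for j in 0..5: ... A[i + j] ...
original = analyze((4, 6), [ref_a], perm=(0, 1))
# Same nest after interchanging the two loops (j outermost).
interchanged = analyze((4, 6), [ref_a], perm=(1, 0))

print(original)      # (9, 5)
print(interchanged)  # (9, 3)
```

In this small case both loop orders touch the same 9 elements, but placing the shorter loop innermost shrinks the window from 5 to 3, which illustrates the sense in which a smaller maximum window size corresponds to better data locality. Enumeration of this kind is only practical for small bounds; it is useful mainly as a check against analytical expressions.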