Systematic preprocessing of data dependent constructs for embedded systems
PATMOS'05 Proceedings of the 15th international conference on Integrated Circuit and System Design: Power and Timing Modeling, Optimization and Simulation
Data transfers and storage are the dominant contributors to area and power consumption in modern multimedia embedded systems. High-level memory optimisations can ensure a cost-efficient realisation of these systems. An important step in these optimisations is the application of loop transformations on a geometrical model. However, such loop transformations traditionally cannot optimise code across data-dependent conditions. In this paper we selectively duplicate code in order to enable global loop transformations across data-dependent conditions. We propose a technique that systematically finds the Pareto curve in a 2D exploration space, trading the gain in memory optimisation against the increase in code size. Our technique has been tested on an MP3 audio decoder. Results show a 45.8% decrease in the number of main memory accesses at the cost of a 16.2% increase in code size.
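To illustrate what a Pareto curve in such a 2D space looks like, the following is a minimal sketch, not the technique proposed in the paper. It assumes each candidate code version is summarised by a hypothetical pair (main memory accesses, code size), both to be minimised, and keeps only the non-dominated candidates. The function name `pareto_front_2d` and all numbers are assumptions chosen for illustration.

```python
from typing import List, Tuple

# Each candidate is (main_memory_accesses, code_size); both are minimised.
Point = Tuple[float, float]


def pareto_front_2d(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by the first coordinate."""
    front: List[Point] = []
    best_size = float("inf")
    # Sort by accesses (ties broken by code size), then sweep once.
    for accesses, size in sorted(set(points)):
        # A point survives only if its code size is strictly smaller than
        # that of every point with fewer or equal accesses seen so far.
        if size < best_size:
            front.append((accesses, size))
            best_size = size
    return front


# Made-up numbers for illustration only; not measurements from the paper.
candidates = [(100, 10), (80, 12), (80, 15), (60, 20), (90, 25), (55, 20)]
print(pareto_front_2d(candidates))
```

Each surviving point represents a trade-off in which fewer memory accesses can only be obtained by accepting a larger code size. A designer would then pick one point on the curve according to the code-size budget.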