Compiler transformations for high-performance computing
ACM Computing Surveys (CSUR)
Formalized methodology for data reuse exploration for low-power hierarchical memory mappings
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Data and memory optimization techniques for embedded systems
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Reducing memory requirements of nested loops for embedded systems
Proceedings of the 38th annual Design Automation Conference
Image and Video Compression Standards: Algorithms and Architectures
Image and Video Compression Standards: Algorithms and Architectures
Custom Memory Management Methodology: Exploration of Memory Organisation for Embedded Multimedia System Design
Algorithms, Complexity Analysis and VLSI Architectures for MPEG-4 Motion Estimation
Algorithms, Complexity Analysis and VLSI Architectures for MPEG-4 Motion Estimation
Power Aware Design Methodologies
Power Aware Design Methodologies
EAC: A Compiler Framework for High-Level Energy Estimation and Optimization
Proceedings of the conference on Design, automation and test in Europe
WVLSI '01 Proceedings of the IEEE Computer Society Workshop on VLSI 2001
Integrated Computer-Aided Engineering
A fast hierarchical motion vector estimation algorithm using mean pyramid
IEEE Transactions on Circuits and Systems for Video Technology
Hi-index | 0.00 |
The continuous increase of the computational power of programmable processors has established them as an attractive design alternative, for implementation of the most computationally intensive applications, like video compression. To enforce this trend, designers implementing applications on programmable platforms have to be provided with reliable and in-depth data and instruction analysis that will allow for the early selection of the most appropriate application for a given set of specifications. To address this need, we introduce a new methodology for early and accurate estimation of the number of instructions required for the execution of an application, together with the number of data memory transfers on a programmable processor. The high-level estimation is achieved by a series of mathematical formulas; these describe not only the arithmetic operations of an application, but also its control and addressing operations, if it is executed on a programmable core. The comparative study, which is done using three popular processors (ARM, MIPS, and Pentium), shows the high efficiency and accuracy of the methodology proposed, in terms of the number of executed (micro-)instructions (i.e. performance) and the number of data memory transfers (i.e. memory power consumption). Using the proposed methodology we estimated an average deviation of 23% in our estimated figures compared with the measurements taken from the real execution on the CPUs.