In Single Instruction Multiple Data (SIMD) processors, the number of cycles needed for each external memory access depends heavily on the access pattern: aligned, unaligned, or strided. We developed a high-performance dynamic on-chip memory-allocation method for SIMD processors that considers the memory access pattern as well as the access frequency. The access pattern and access count of each array in a loop are determined by a combination of code analysis and profiling, both performed within a compiler framework that we developed. This framework not only performs dynamic on-chip memory allocation but also generates optimized code for the target processor. The proposed allocation method has been tested with several multimedia benchmarks, including motion estimation, the 2-D discrete cosine transform, and an MPEG-2 encoder.
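To illustrate why the access pattern matters alongside the access frequency, the following is a minimal sketch and not the authors' algorithm. It assumes hypothetical per-access cycle costs (`EXTERNAL_CYCLES`, `ONCHIP_CYCLES`), a hypothetical `ArrayInfo` record holding the size, pattern, and profiled access count of each array in one loop, and it uses a plain 0/1 knapsack as a stand-in for the selection step.

```python
from dataclasses import dataclass

# Assumed cycle costs per access; real values depend on the target processor.
EXTERNAL_CYCLES = {"aligned": 4, "unaligned": 8, "stride": 16}
ONCHIP_CYCLES = 1


@dataclass(frozen=True)
class ArrayInfo:
    name: str
    size: int      # footprint in bytes
    pattern: str   # "aligned", "unaligned", or "stride"
    count: int     # profiled number of accesses in the loop


def saved_cycles(a: ArrayInfo) -> int:
    """Cycles saved by serving all accesses to `a` from on-chip memory."""
    return a.count * (EXTERNAL_CYCLES[a.pattern] - ONCHIP_CYCLES)


def allocate(arrays, capacity):
    """Pick arrays for on-chip memory with a 0/1 knapsack over `capacity` bytes.

    Returns (total saved cycles, tuple of chosen array names).
    """
    best = [(0, ())] * (capacity + 1)
    for a in arrays:
        gain = saved_cycles(a)
        # Iterate capacities downward so each array is used at most once.
        for c in range(capacity, a.size - 1, -1):
            candidate = best[c - a.size][0] + gain
            if candidate > best[c][0]:
                best[c] = (candidate, best[c - a.size][1] + (a.name,))
    return best[capacity]


# Illustrative loop: A is accessed most often, but B's strided accesses cost more.
arrays = [
    ArrayInfo("A", 64, "aligned", 1000),
    ArrayInfo("B", 64, "stride", 300),
    ArrayInfo("C", 32, "unaligned", 200),
]
gain, chosen = allocate(arrays, capacity=96)
print(gain, chosen)
```

Under these assumed costs, a policy based on access count alone would favor `A`, whereas weighting each count by its pattern-dependent cost selects `B` and `C`, because their strided and unaligned accesses are more expensive when served from external memory.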