We introduce a new register file architecture that supports both row-wise and column-wise access, so that partitioned (subword-parallel) instructions can be applied to column-wise processing without transposition overhead. This accelerates 2D separable image and video processing algorithms, such as 2D convolution and the 2D discrete cosine transform (DCT), by eliminating the transposition steps they otherwise require.
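The idea can be illustrated with a small software model. The sketch below is not the architecture described in the abstract. It is a toy Python model under stated assumptions: a square block of N registers with N subwords each, a lane-wise multiply-accumulate standing in for a partitioned instruction, and a generic separable transform `T · X · Tᵀ` in place of a real DCT or convolution kernel. All names (`MatrixRegisterFile`, `read_col`, `partitioned_mac`, and so on) are hypothetical. It shows that the same lane-wise kernel can serve both passes when registers can be read and written as columns, whereas a row-only register file needs two explicit transposes to produce the same result.

```python
class MatrixRegisterFile:
    """Toy model: N registers of N subwords (hypothetical, not the paper's design)."""

    def __init__(self, rows):
        n = len(rows)
        if any(len(r) != n for r in rows):
            raise ValueError("expected a square N x N block")
        self.n = n
        self.cells = [list(r) for r in rows]

    def read_row(self, i):
        # Conventional access: one register, all of its subwords.
        return list(self.cells[i])

    def read_col(self, j):
        # Column-wise access: subword j taken from every register.
        return [self.cells[i][j] for i in range(self.n)]

    def write_row(self, i, values):
        self.cells[i] = list(values)

    def write_col(self, j, values):
        for i in range(self.n):
            self.cells[i][j] = values[i]


def partitioned_mac(acc, coeff, vec):
    """Stand-in for a partitioned instruction: every lane does acc + coeff * v."""
    return [a + coeff * v for a, v in zip(acc, vec)]


def transform_1d(vectors, T):
    """out[k] = sum_i T[k][i] * vectors[i], computed lane by lane."""
    n = len(vectors)
    out = []
    for k in range(n):
        acc = [0] * n
        for i in range(n):
            acc = partitioned_mac(acc, T[k][i], vectors[i])
        out.append(acc)
    return out


def separable_2d_column_access(X, T):
    """Compute T * X * T^T using row reads, then column reads, with no transpose."""
    rf = MatrixRegisterFile(X)
    n = rf.n
    # Pass 1: operands are rows, so each lane holds one column of the block.
    rows = [rf.read_row(i) for i in range(n)]
    for k, vec in enumerate(transform_1d(rows, T)):
        rf.write_row(k, vec)
    # Pass 2: same kernel, but operands are columns read directly.
    cols = [rf.read_col(j) for j in range(n)]
    for k, vec in enumerate(transform_1d(cols, T)):
        rf.write_col(k, vec)
    return rf.cells


def transpose(M):
    return [list(col) for col in zip(*M)]


def separable_2d_with_transpose(X, T):
    """Baseline for a row-only register file: two explicit transposes."""
    Y = transform_1d(X, T)
    Zt = transform_1d(transpose(Y), T)
    return transpose(Zt)
```

For example, with `T = [[1, 1], [1, -1]]` and `X = [[1, 2], [3, 4]]`, both functions return the same block. The column-access version reaches it without calling `transpose`, which is the overhead the abstract says the architecture removes.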