Scalable Parallel Memory Architectures for Video Coding

Authors:
Jarno K. Tanskanen;Jarkko T. Niittylahti
Affiliations:
Tampere University of Technology, Department of Information Technology, Institute of Digital and Computer Systems, P.O. Box 553, FIN-33101 Tampere, Finland;Tampere University of Technology, Department of Information Technology, Institute of Digital and Computer Systems, P.O. Box 553, FIN-33101 Tampere, Finland
Venue:
Journal of VLSI Signal Processing Systems
Year:
2004

Citing 8
Cited 2

DSP Processor Fundamentals: Architectures and Features

DSP Processor Fundamentals: Architectures and Features
Subword Parallelism with MAX-2

IEEE Micro
Subword Permutation Instructions for Two-Dimensional Multimedia Processing in MicroSIMD Architectures

ASAP '00 Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures, and Processors
A Register File with Transposed Access Mode

ICCD '00 Proceedings of the 2000 IEEE International Conference on Computer Design: VLSI in Computers & Processors
On Design of Parallel Memory Access Schemes for Video Coding

Journal of VLSI Signal Processing Systems
MMX-Based DCT and MC Algorithms for Real-Time Pure Software MPEG Decoding

ICMCS '99 Proceedings of the IEEE International Conference on Multimedia Computing and Systems - Volume 2
Architecture and applications of the HiPAR video signal processor

IEEE Transactions on Circuits and Systems for Video Technology
A design study of a 0.25-μm video signal processor

IEEE Transactions on Circuits and Systems for Video Technology

On Design of Parallel Memory Access Schemes for Video Coding

Journal of VLSI Signal Processing Systems
Configurable data memory for multimedia processing

Journal of Signal Processing Systems - Special Issue: Embedded computing systems for DSP

Abstract

Current video compression standards, which process frames macroblock by macroblock, employ several processing functions to achieve compression. These functions access the data memory address space in different ways. For example, motion estimation and motion compensation often require data accesses that are not aligned to word boundaries. The discrete cosine transform (DCT) and its inverse (IDCT) for an 8 × 8 block, on the other hand, can be performed first on the rows and then on the columns, so a transposition is needed between these two stages. A parallel memory architecture can provide, among other things, a solution for both of these tasks. In a companion paper, we briefly surveyed parallel memory architectures and proposed parallel memory architecture designs for different data path widths for video coding applications. In this paper, we construct example implementations of video coding functions that use the proposed parallel data memory efficiently. Furthermore, the performance and implementation cost of the parallel memory architecture are estimated and compared to those of more conventional memory architectures. The examples are given for different data bus widths (16, 32, 64, and 128 bits). We show that the parallel memory can keep the data path fully utilized in many video coding function implementations. This ensures high-speed operation and full utilization of the processing resources.
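The row-column decomposition mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not model their parallel memory. It only shows why a transposition sits between the two 1-D stages of an 8 × 8 DCT. The function names (dct_1d, transpose, dct_2d_row_column, dct_2d_direct) are hypothetical, and the orthonormal DCT-II scaling is an assumption made for the example.

```python
import math

N = 8  # block size named in the abstract


def dct_1d(v):
    """Orthonormal 1-D DCT-II of a sequence (scaling is an assumption)."""
    n = len(v)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
        out.append(scale * acc)
    return out


def transpose(block):
    """Swap rows and columns of a square block."""
    return [list(col) for col in zip(*block)]


def dct_2d_row_column(block):
    """2-D DCT as two 1-D passes with a transposition in between."""
    rows_done = [dct_1d(row) for row in block]       # stage 1: rows
    transposed = transpose(rows_done)                # columns become rows
    cols_done = [dct_1d(row) for row in transposed]  # stage 2: columns
    return transpose(cols_done)                      # restore orientation


def dct_2d_direct(block):
    """Direct double-sum 2-D DCT, used only to check the result."""
    n = len(block)

    def c(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    return [[c(u) * c(v) * sum(
                block[r][s]
                * math.cos(math.pi * (2 * r + 1) * u / (2 * n))
                * math.cos(math.pi * (2 * s + 1) * v / (2 * n))
                for r in range(n) for s in range(n))
             for v in range(n)]
            for u in range(n)]


if __name__ == "__main__":
    ramp = [[float(r * N + s) for s in range(N)] for r in range(N)]
    fast = dct_2d_row_column(ramp)
    ref = dct_2d_direct(ramp)
    worst = max(abs(fast[u][v] - ref[u][v])
                for u in range(N) for v in range(N))
    print("largest difference from direct 2-D DCT:", worst)
```

In software the transposition is an explicit data shuffle, as in transpose above. A memory that can deliver both rows and columns of a block at full width would make that step a matter of addressing rather than extra data movement, which is the kind of access pattern the abstract refers to.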