Boundary macroblock padding in MPEG-4 video decoding using a graphics coprocessor

  • Authors:
  • R. Garg;C. Y. Chung;Donglok Kim;Yongmin Kim

  • Affiliations:
  • Dept. of Electr. Eng. & Bioeng., Univ. of Washington, Seattle, WA;-;-;-

  • Venue:
  • IEEE Transactions on Circuits and Systems for Video Technology
  • Year:
  • 2002

Abstract

MPEG-4 is the latest multimedia coding standard that supports object-based coding and manipulation of natural video and synthetic graphics objects. Owing to its rich feature set and high coding efficiency, MPEG-4 is becoming popular in video streaming applications. Many graphics coprocessors accelerate the inverse discrete cosine transform (IDCT) and motion compensation for real-time video decoding, so it is desirable to use them to accelerate MPEG-4 video decoding as well. Since MPEG-4 video decoding for rectangular video objects is similar to that of other video coding standards, e.g., MPEG-2, the IDCT and motion compensation can still be executed on the graphics coprocessors. However, we have found that boundary macroblock padding, an essential processing step in decoding arbitrarily shaped video objects, cannot be efficiently accelerated on the graphics coprocessors due to its complexity. The boundary macroblock padding can be implemented on the host processor instead, but then the frame data processed on the graphics coprocessor need to be transferred to the host processor for padding, and the padded data need to be sent back to the graphics coprocessor to serve as a reference for subsequent frames. To avoid this overhead, we present two approaches to boundary macroblock padding. In the first approach, the boundary macroblock padding is partitioned into two tasks, one of which the host processor can perform without the overhead of data transfers. In the second approach, we propose two new instructions and an algorithm that can be easily adopted in next-generation graphics coprocessors or mediaprocessors, giving a performance improvement of up to a factor of nine over the Pentium III.
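
To make the padding step concrete, the following is a minimal Python sketch of repetitive padding for one boundary block, assuming the usual two-pass scheme: a horizontal pass that fills transparent pixels in each row from the nearest opaque pixels, followed by a vertical pass that fills rows containing no opaque pixels from the nearest padded rows. It is not the algorithm or the instructions proposed in the paper. The names pad_line and pad_boundary_block, the list-of-lists block layout, and the round-half-up averaging are assumptions made for illustration.

```python
def pad_line(values, known):
    """Fill unknown samples of one line from the nearest known samples.

    A sample with known neighbours on both sides gets their rounded average;
    with one known side it copies that side; with none it is left unchanged.
    """
    n = len(values)
    out = list(values)

    # Nearest known value to the left of (or at) each position.
    left = [None] * n
    last = None
    for i in range(n):
        if known[i]:
            last = values[i]
        left[i] = last

    # Sweep from the right, tracking the nearest known value on that side.
    right = None
    for i in range(n - 1, -1, -1):
        if known[i]:
            right = values[i]
        elif left[i] is not None and right is not None:
            out[i] = (left[i] + right + 1) // 2  # assumed rounding convention
        elif right is not None:
            out[i] = right
        elif left[i] is not None:
            out[i] = left[i]
    return out


def pad_boundary_block(texture, mask):
    """Pad a boundary block given its texture and binary shape mask.

    texture and mask are equally sized lists of rows; a nonzero mask entry
    marks an opaque (object) pixel. Returns a new padded block.
    """
    height = len(texture)
    width = len(texture[0])

    # Horizontal pass: pad every row that has at least one opaque pixel.
    rows = []
    row_has_data = []
    for y in range(height):
        known = [bool(m) for m in mask[y]]
        rows.append(pad_line(texture[y], known))
        row_has_data.append(any(known))

    # Vertical pass: fill fully transparent rows, column by column.
    out = [list(r) for r in rows]
    for x in range(width):
        column = [rows[y][x] for y in range(height)]
        filled = pad_line(column, row_has_data)
        for y in range(height):
            out[y][x] = filled[y]
    return out
```

The sketch processes one block on its own and leaves a block with no opaque pixels unchanged; handling of fully transparent macroblocks that neighbour boundary macroblocks is a separate step and is not shown here.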