Memory Performance Optimizations For Real-Time Software HDTV Decoding

  • Authors:
  • Han Chen;Kai Li;Bin Wei

  • Affiliations:
  • IBM TJ Watson Research Center, NY, USA 10532;Princeton University, Princeton, USA 08544;AT&T Labs Research, Florham Park, USA 07932

  • Venue:
  • Journal of VLSI Signal Processing Systems
  • Year:
  • 2005

Abstract

Pure software HDTV video decoding is still a challenging task on entry-level to mid-range desktop and notebook PCs, even with today's microprocessor clock frequencies measured in GHz. This paper shows that the performance bottleneck in a software MPEG-2 decoder has shifted to memory operations, as microprocessor technologies, including multimedia instruction extensions, have improved rapidly in recent years. Our study exploits concurrency at the macroblock level to alleviate this bottleneck. First, the paper introduces an interleaved block-order data layout to improve CPU cache performance. Second, the paper describes an algorithm to explicitly prefetch macroblocks for motion compensation. Finally, the paper presents an algorithm to schedule interleaved decoding and output at the macroblock level. Our implementation and experiments show that these methods can effectively hide the latency of memory and the frame buffer. The optimizations improve the performance of a multimedia-instruction-optimized software MPEG-2 decoder by a factor of about two. On a PC with a 933 MHz Pentium III CPU, the decoder can decode and display 1280 × 720-resolution HDTV streams at over 62 frames per second.
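
The abstract does not spell out the data layout, so the following is only an illustrative sketch of why a block-order layout can help cache behavior, not the authors' implementation. It assumes one byte per pixel, 8 × 8 blocks, and a 32-byte cache line; the function names are hypothetical, and the interleaving part of the paper's layout is not modeled here. The sketch counts how many cache lines one 8 × 8 block touches under a conventional raster layout versus a layout that stores each block contiguously.

```python
BLOCK = 8  # MPEG-2 codes pixels in 8x8 blocks


def raster_offset(x, y, width):
    """Byte offset of pixel (x, y) in a row-by-row (raster) plane."""
    return y * width + x


def block_order_offset(x, y, width):
    """Byte offset of pixel (x, y) when each 8x8 block is stored contiguously.

    Assumes width is a multiple of BLOCK (hypothetical simplification).
    """
    blocks_per_row = width // BLOCK
    block_index = (y // BLOCK) * blocks_per_row + (x // BLOCK)
    return block_index * BLOCK * BLOCK + (y % BLOCK) * BLOCK + (x % BLOCK)


def cache_lines_touched(offset_fn, bx, by, width, line_size=32):
    """Count distinct cache lines covered by block (bx, by) under a layout.

    line_size is an assumed cache-line size in bytes.
    """
    lines = set()
    for dy in range(BLOCK):
        for dx in range(BLOCK):
            off = offset_fn(bx * BLOCK + dx, by * BLOCK + dy, width)
            lines.add(off // line_size)
    return len(lines)


if __name__ == "__main__":
    width = 1280  # luma width of a 1280 x 720 frame
    print("raster:     ", cache_lines_touched(raster_offset, 3, 2, width))
    print("block-order:", cache_lines_touched(block_order_offset, 3, 2, width))
```

Under these assumptions, the rows of a block in raster order are 1280 bytes apart, so each row lands on its own cache line, whereas the block-order version packs the same 64 bytes into adjacent lines. Real decoders also have to handle motion-compensation reads that straddle block boundaries, which this sketch ignores.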