Batch-Pipelining for H.264 Decoding on Multicore Systems

  • Authors:
  • Tang-Hsun Tu; Chih-Wen Hsueh

  • Venue:
  • DCC '10 Proceedings of the 2010 Data Compression Conference
  • Year:
  • 2010

Abstract

Pipelining has been applied in many areas to improve performance by overlapping the execution of computing stages. However, it is difficult to apply to H.264/AVC decoding at the frame level, because the bitstreams are encoded with many dependencies and little parallelism is left to exploit. As a result, much prior work relies on hardware assistance. Fortunately, pure software pipelining can be applied to H.264/AVC decoding at the macroblock level with a reasonable performance gain. However, the pipeline stages may need to synchronize with one another, which incurs considerable extra overhead. Moreover, this overhead becomes relatively larger as the stages themselves execute faster with better hardware and software optimization. We first group multiple stages into larger groups, as "batched" pipelining, to execute concurrently on multicore systems. Stages in different groups may not need to synchronize with each other, so the approach incurs little overhead and can be highly scalable. On this basis, we propose a novel and effective batch-pipeline (BP) approach for H.264/AVC decoding on multicore systems. Moreover, because of its flexibility, BP can be combined with other hardware approaches or software technologies to further improve performance. To optimize our approach, we analyze how to group the macroblocks and derive closed-form formulas to guide the grouping. We also conduct experiments on various bitstreams to verify our approach. The results show that it achieves a speedup of up to 93% and up to 249 and 70 FPS for 720p and 1080p resolutions, respectively, on a 4-core machine over a published optimized H.264 decoder. We believe our batch-pipelining approach creates a new and effective direction for multimedia software codec development.
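
The trade-off the abstract describes, between synchronization overhead and batch size, can be illustrated with a toy cost model. The sketch below is not the paper's closed-form formula. It assumes a perfectly balanced pipeline in which every stage takes the same time per macroblock and pays one fixed synchronization cost per handoff. All function names, parameters, and numeric values are hypothetical and chosen only for illustration.

```python
import math


def pipeline_makespan(num_macroblocks, num_stages, stage_time, sync_cost, batch_size):
    """Total time for a balanced pipeline that hands off work in batches.

    Assumed toy model (not taken from the paper): each stage spends
    `stage_time` per macroblock and pays `sync_cost` once per batch handoff.
    """
    num_batches = math.ceil(num_macroblocks / batch_size)
    step_time = batch_size * stage_time + sync_cost
    # A pipeline with S stages needs (batches + S - 1) steps to fill and drain.
    return (num_batches + num_stages - 1) * step_time


def best_batch_size(num_macroblocks, num_stages, stage_time, sync_cost):
    """Brute-force search for the batch size that minimizes the makespan."""
    return min(
        range(1, num_macroblocks + 1),
        key=lambda b: pipeline_makespan(
            num_macroblocks, num_stages, stage_time, sync_cost, b
        ),
    )


if __name__ == "__main__":
    # A 1280x720 frame holds 80 x 45 = 3600 macroblocks of 16x16 pixels.
    # The stage count, stage time, and sync cost are hypothetical units.
    n, stages, t, sync = 3600, 4, 1.0, 2.0
    b = best_batch_size(n, stages, t, sync)
    print("unbatched makespan:", pipeline_makespan(n, stages, t, sync, 1))
    print("best batch size:", b)
    print("batched makespan:", pipeline_makespan(n, stages, t, sync, b))
```

In this model, small batches pay the synchronization cost too often, while large batches lengthen the time needed to fill and drain the pipeline. It also shows why the overhead matters more as stages get faster: the synchronization share of each step, `sync_cost / (batch_size * stage_time + sync_cost)`, grows as `stage_time` shrinks.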