A Novel Macro-Block Group Based AVS Coding Scheme for Many-Core Processor

  • Authors:
  • Zhenyu Wang;Luhong Liang;Guolei Yang;Xianguo Zhang;Jun Sun;Debin Zhao;Wen Gao

  • Affiliations:
  • Peking University, Beijing, China 100871;Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China 100190;Peking University, Beijing, China 100871;Peking University, Beijing, China 100871;Peking University, Beijing, China 100871;Harbin Institute of Technology, Harbin, China 150001;Peking University, Beijing, China 100871

  • Venue:
  • Journal of Signal Processing Systems
  • Year:
  • 2011

Abstract

Implementation of video coding systems such as H.264/AVC and AVS on multi-core and many-core platforms is attracting much attention. Slice-level parallelism is popular in parallel video coding for its simplicity and flexibility; however, video quality degrades considerably because partitioning a frame into slices breaks the dependencies between macro-blocks, especially on multi-core or many-core platforms. To address this problem, we propose a Macro-Block Group (MBG) scheme for parallel AVS coding. In the proposed scheme, video frames are equally divided into rectangular MBG regions; each MBG consists of more rows and fewer columns of macro-blocks than a slice in the slice-level scheme. Because AVS syntax does not support MBGs, a vertical partitioning scheme is introduced. Additionally, we use mode confining and motion vector difference adjusting techniques to remain consistent with the standard. Two MBG parallel schemes (5 × 9 MBG partition and 8 × 7 MBG partition) are developed on a TILE64 many-core platform, where P/B frames use the MBG parallel scheme and I frames use macro-block-level parallelism. Experimental results show that the 5 × 9 MBG partition reduces quality loss by 52% (IPPP) and 41% (IBBP) while keeping the same speed-up as slice-level parallelism. With more cores employed, the 8 × 7 MBG partition achieves a 23.9-fold speed-up over the single-core implementation and coding performance similar to that of the 5 × 9 scheme.
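
As a rough illustration of the partitioning idea described in the abstract, the following Python sketch divides a frame into a rectangular grid of regions and counts how many macro-block edges lie on a boundary between two regions. It is not the authors' implementation. The function names, the 1280 × 720 frame size (80 × 45 macro-blocks of 16 × 16 pixels), the reading of "5 × 9" as 5 columns by 9 rows of regions, and the use of shared boundary edges as a proxy for broken dependencies are all assumptions made for this example.

```python
def partition_into_mbgs(mb_cols, mb_rows, grid_cols, grid_rows):
    """Split a frame of mb_cols x mb_rows macro-blocks into a grid of regions.

    Returns a list of (x0, y0, width, height) tuples in macro-block units,
    ordered row by row. Hypothetical helper, not taken from the paper.
    """
    def edges(total, parts):
        # Integer boundaries spaced as evenly as possible along one axis.
        return [i * total // parts for i in range(parts + 1)]

    xs = edges(mb_cols, grid_cols)
    ys = edges(mb_rows, grid_rows)
    return [
        (xs[c], ys[r], xs[c + 1] - xs[c], ys[r + 1] - ys[r])
        for r in range(grid_rows)
        for c in range(grid_cols)
    ]


def shared_boundary_edges(mb_cols, mb_rows, grid_cols, grid_rows):
    """Count macro-block edges lying on a boundary between two regions.

    Assumed proxy for how many neighbour dependencies a partition breaks;
    it is not a quality metric reported by the authors.
    """
    horizontal = (grid_rows - 1) * mb_cols  # edges along row boundaries
    vertical = (grid_cols - 1) * mb_rows    # edges along column boundaries
    return horizontal + vertical


# Assumed frame: 1280 x 720 pixels = 80 x 45 macro-blocks.
MB_COLS, MB_ROWS = 80, 45

# 45 regions as a 5-column by 9-row grid (assumed reading of "5 x 9").
mbgs = partition_into_mbgs(MB_COLS, MB_ROWS, 5, 9)

# 45 regions as full-width horizontal strips, one macro-block row each,
# standing in for slice-level partitioning on the same number of cores.
slices = partition_into_mbgs(MB_COLS, MB_ROWS, 1, 45)

mbg_edges = shared_boundary_edges(MB_COLS, MB_ROWS, 5, 9)     # 820
slice_edges = shared_boundary_edges(MB_COLS, MB_ROWS, 1, 45)  # 3520
```

Under these assumptions both partitions give 45 regions of 80 macro-blocks each, but the grid regions are 16 macro-blocks wide and 5 tall, so far fewer macro-blocks sit next to a region boundary than with one-row strips. This matches the abstract's description of an MBG as having more rows and fewer columns than a slice.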