An efficient parallelization technique for x264 encoder on heterogeneous platforms consisting of CPUs and GPUs

  • Authors: Youngsub Ko; Youngmin Yi; Soonhoi Ha

  • Affiliations: School of EECS, Seoul National University, Seoul, Korea; School of ECE, University of Seoul, Seoul, Korea; School of EECS, Seoul National University, Seoul, Korea

  • Venue: Journal of Real-Time Image Processing

  • Year: 2014

Abstract

H.264/AVC video encoders have been widely used for their high coding efficiency. Because the computational demand grows with frame resolution, which is constantly increasing, there has been great interest in accelerating H.264/AVC encoding through parallel processing. Recently, graphics processing units (GPUs) have emerged as a viable target for accelerating general-purpose applications by exploiting fine-grained data parallelism. Despite extensive research efforts to use GPUs to accelerate H.264/AVC encoding, none has achieved a speed-up over x264, which is known as the fastest CPU implementation, mainly because of the significant communication overhead between the host CPU and the GPU and the intra-frame dependencies in the algorithm. In this paper, we propose a novel motion-estimation (ME) algorithm tailored for NVIDIA GPU implementation. It is accompanied by a novel pipelining technique, called sub-frame ME processing, that effectively hides the communication overhead between the host CPU and the GPU. Further, we incorporate a frame-level parallelization technique to improve the overall throughput. Experimental results show that our proposed H.264 encoder achieves higher performance than the x264 encoder.
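
To illustrate why splitting a frame into sub-frames can hide communication overhead, the following is a minimal timing sketch. It is not the authors' implementation. It assumes an idealized two-stage pipeline (host-GPU transfer, then GPU motion estimation) in which the transfer of one sub-frame overlaps with the computation of the previous one, and in which both costs divide evenly across sub-frames. The function name and the example timings are hypothetical.

```python
def pipelined_frame_time(transfer_ms: float, compute_ms: float, n_subframes: int) -> float:
    """Idealized per-frame latency of a two-stage pipeline (an assumed toy model).

    transfer_ms: time to move a whole frame between host and GPU.
    compute_ms:  time to run motion estimation on a whole frame.
    n_subframes: number of equal sub-frames the frame is split into.
    """
    if n_subframes < 1:
        raise ValueError("n_subframes must be at least 1")
    t = transfer_ms / n_subframes  # transfer time per sub-frame
    c = compute_ms / n_subframes   # compute time per sub-frame
    # The first transfer and the last computation cannot overlap with anything;
    # in between, each step is bounded by the slower of the two stages.
    return t + (n_subframes - 1) * max(t, c) + c


# Hypothetical numbers: 4 ms transfer and 8 ms motion estimation per frame.
whole_frame = pipelined_frame_time(4.0, 8.0, 1)  # no overlap possible
four_parts = pipelined_frame_time(4.0, 8.0, 4)   # transfers mostly hidden
print(whole_frame, four_parts)
```

Under these assumptions, finer sub-frames leave only the first transfer exposed, so the latency approaches that of the slower stage alone. A real encoder would also have to account for per-transfer launch costs and the dependencies between neighboring macroblocks, which this model ignores.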