Video Coding With Rate-Distortion Optimized Transform

  • Authors:
  • Xin Zhao;Li Zhang;Siwei Ma;Wen Gao

  • Affiliations:
  • Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China;Institute of Digital Media, School of Electronic Engineering and Computer Science, Peking University, Beijing, China;Institute of Digital Media, School of Electronic Engineering and Computer Science, Peking University, Beijing, China;Institute of Digital Media, School of Electronic Engineering and Computer Science, Peking University, Beijing, China

  • Venue:
  • IEEE Transactions on Circuits and Systems for Video Technology
  • Year:
  • 2012

Abstract

Block-based discrete cosine transform (DCT) has been adopted in several international image/video coding standards, e.g., MPEG-2 and H.264/AVC, because it achieves a good tradeoff between performance and complexity. Although DCT theoretically approximates the optimum Karhunen–Loève transform under first-order Markov conditions, a single fixed set of transform basis functions (TBF) cannot handle all cases efficiently because video content is non-stationary. To further improve the performance of block-based transform coding, this paper presents the design of a rate-distortion optimized transform (RDOT), which contributes to both intraframe and interframe coding. The key property that distinguishes RDOT from the conventional DCT is that the transform is implemented with multiple TBF candidates obtained from off-line training. With this feature, when coding each residual block, the encoder can select the optimal set of TBF in terms of rate-distortion performance, and better energy compaction is achieved in the transform domain. To obtain an optimum group of candidate TBF, we have developed a two-step iterative optimization technique for the off-line training, in which the TBF candidates are refined at each iteration until the training process converges. An analysis of the optimal group of candidate TBF is also presented, together with a detailed description of a practical implementation of the proposed algorithm on the latest VCEG key technical area software platform. Extensive experimental results show that, compared with the conventional DCT-based transform scheme adopted in the state-of-the-art H.264/AVC video coding standard, the proposed method achieves a significant improvement in coding performance for both intraframe and interframe coding.
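
The per-block selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a 1-D residual, a Lagrangian cost J = D + λR, a crude logarithmic bit-count proxy for the rate, and an identity transform standing in for a trained TBF candidate. All function names and parameters (`rd_cost`, `select_transform`, `qstep`, `lam`, `index_bits`) are hypothetical.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis; row k holds the k-th basis function."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def identity_matrix(n):
    """Stand-in for a trained candidate (assumption, not from the paper)."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def forward(basis, x):
    """Project the residual onto each basis function."""
    return [sum(b * v for b, v in zip(row, x)) for row in basis]

def inverse(basis, coeffs):
    """Inverse of an orthonormal transform is its transpose."""
    n = len(coeffs)
    return [sum(basis[k][i] * coeffs[k] for k in range(n)) for i in range(n)]

def rd_cost(basis, x, qstep, lam):
    """Lagrangian cost J = D + lam * R for one candidate basis."""
    levels = [round(c / qstep) for c in forward(basis, x)]
    recon = inverse(basis, [lv * qstep for lv in levels])
    distortion = sum((a - b) ** 2 for a, b in zip(x, recon))
    # Rate proxy (assumption): a few bits per nonzero quantized level.
    rate = sum(1 + 2 * math.log2(1 + abs(lv)) for lv in levels if lv != 0)
    return distortion + lam * rate

def select_transform(candidates, x, qstep, lam, index_bits):
    """Return (best index, all costs); index_bits models the signalling cost."""
    costs = [rd_cost(b, x, qstep, lam) + lam * index_bits for b in candidates]
    best = min(range(len(candidates)), key=costs.__getitem__)
    return best, costs

candidates = [dct_matrix(8), identity_matrix(8)]
flat_choice, _ = select_transform(candidates, [11.0] * 8, 4.0, 1.0, 1)
spike_choice, _ = select_transform(candidates, [0, 0, 0, 40, 0, 0, 0, 0], 4.0, 1.0, 1)
```

In this toy setup the flat residual selects the DCT candidate (index 0), because its energy compacts into a single coefficient. The isolated spike selects the identity candidate (index 1), because the DCT spreads it over all eight coefficients.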