Fast perceptual region tracking with coding-depth sensitive access for stream transcoding

  • Authors:
  • Javed I. Khan; Zhong Guo

  • Affiliations:
  • Media Communications and Networking Research Laboratory, Department of Computer Science, Kent State University, 233 MSB, Kent, OH 44242, USA

  • Venue:
  • Journal of Visual Communication and Image Representation
  • Year:
  • 2008

Abstract

Object-based bit allocation can significantly improve the perceptual quality of extremely compressed video. However, real-time video object detection in large-format, high-fidelity video is computationally daunting. Most algorithms begin with extensive use of classical pixel-level analysis and thus remain computationally heavy. Based on recent results in human visual perception, this paper presents an experimental visual region tracking algorithm designed specifically for perceptual stream transcoding. It exploits the cue order observed in human visual perception to achieve very high computation speed as well as tracking efficiency. Rather than starting from the pixel level, or using any pixel-level processing at all, it applies high-level motion cue and block shape cue analysis to the motion vector set to identify signatures of the relative movements between the object of interest, the scene background, and the camera, and from these signatures it identifies objects. It then uses predictive filters to track the regions. The result is a fast yet highly effective perceptual region tracking algorithm that can operate at stream rate and track the regions of perceptually significant objects despite camera movements such as zoom, panning, and translation. The technique is not specific to any special class of objects. We have implemented this algorithm in a live ISO-13818/MPEG-2 perceptual transcoder, and in this paper we report the performance of this implementation. This fast object-aware video-rate transcoder is particularly suitable for live streaming and can convert a regular stream into a perceptually coded video stream.
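The general idea of working on motion vectors instead of pixels, then tracking the detected region with a predictive filter, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not give its details. The sketch assumes a simple pan-only camera model (the median motion vector is taken as camera motion, so zoom is not handled), a hypothetical residual threshold, and an alpha-beta filter standing in for the unspecified predictive filters. All function names, class names, and parameter values are illustrative assumptions.

```python
from statistics import median


def segment_moving_blocks(motion_vectors, threshold=2.0):
    """Split blocks into background and moving blocks.

    motion_vectors maps (row, col) block coordinates to (dx, dy) vectors.
    Assumption: the per-axis median vector approximates camera panning.
    """
    cam_dx = median(v[0] for v in motion_vectors.values())
    cam_dy = median(v[1] for v in motion_vectors.values())
    moving = set()
    for block, (dx, dy) in motion_vectors.items():
        # Residual motion after removing the estimated camera motion.
        rx, ry = dx - cam_dx, dy - cam_dy
        if (rx * rx + ry * ry) ** 0.5 > threshold:
            moving.add(block)
    return (cam_dx, cam_dy), moving


def centroid(blocks):
    """Mean (row, col) position of a non-empty set of blocks."""
    n = len(blocks)
    return (sum(r for r, _ in blocks) / n, sum(c for _, c in blocks) / n)


class AlphaBetaTracker:
    """Alpha-beta filter that smooths and predicts a region centroid."""

    def __init__(self, position, alpha=0.5, beta=0.1):
        self.pos = position
        self.vel = (0.0, 0.0)
        self.alpha = alpha
        self.beta = beta

    def update(self, measured):
        # Predict one frame ahead, then correct with the measurement.
        pred = (self.pos[0] + self.vel[0], self.pos[1] + self.vel[1])
        res = (measured[0] - pred[0], measured[1] - pred[1])
        self.pos = (pred[0] + self.alpha * res[0],
                    pred[1] + self.alpha * res[1])
        self.vel = (self.vel[0] + self.beta * res[0],
                    self.vel[1] + self.beta * res[1])
        return self.pos


if __name__ == "__main__":
    # A 4x4 block grid panning right, with two blocks moving faster.
    mv = {(r, c): (3.0, 0.0) for r in range(4) for c in range(4)}
    mv[(1, 1)] = (8.0, 0.0)
    mv[(1, 2)] = (8.0, 0.0)
    camera, region = segment_moving_blocks(mv)
    tracker = AlphaBetaTracker(centroid(region))
    print(camera, sorted(region), tracker.update((1.0, 2.5)))
```

Because the sketch reads only one vector per block, its cost per frame grows with the number of blocks rather than the number of pixels, which is the kind of saving the abstract attributes to avoiding pixel-level processing.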