A Hybrid Object-Oriented Very Low Bit Rate Video Codec

  • Authors:
  • Taner Özcelik; Aggelos K. Katsaggelos

  • Affiliations:
  • Sony Electronics, Inc., Consumer Audio/Video/Data, 3300 Zanker Road, San Jose, CA 95134, USA; Northwestern University, Department of Electrical and Computer Engineering, Evanston, IL 60208-3118, USA

  • Venue:
  • Journal of VLSI Signal Processing Systems - Special issue on recent development in video: algorithms, implementation and applications
  • Year:
  • 1997

Abstract

There are a large number of applications requiring the compression of video at Very Low Bit Rates (VLBR). Such applications include wireless video conferencing, video over the internet, multimedia database retrieval, and remote sensing and monitoring. Recently, the MPEG-4 standardization effort has been a motivating factor to find a solution to this challenging problem. The existing approaches to this problem can generally be grouped into block-based, model-based, and object-oriented. Block-based approaches follow the traditional strategy of decoupling the image sequence into blocks, model-based approaches rely on complex 3-D models for specific objects that are encoded, and object-oriented approaches rely on analyzing the scene into differently moving objects. All three approaches exhibit potential problems. Block-based approaches tend to generate artifacts at the boundaries of the blocks, as well as to limit the minimum achievable bit-rate due to the fixed analysis structure of the scene. Model-based codecs are limited by the complex 3-D models of the objects to be encoded. On the other hand, object-oriented codecs can generate a significant overhead due to the analysis of the scene which needs to be transmitted, which in turn can be the limiting factor in achieving the target bit-rates. In this paper, we propose a hybrid object-oriented codec in which the correlations among the three information fields, i.e., motion, segmentation and intensity fields, are exploited both spatially and temporally. In the proposed method, additional intelligence is given to the decoder, resulting in a reduction of the required bandwidth. The residual information is analyzed into three different categories, i.e., occlusion, model failures, and global refinement. The residual information is encoded and transmitted across the channel with other side information. Experimental results are presented which demonstrate the effectiveness of the proposed approach.