Priority coding for video-telephony applications based on visual attention

Authors:
Nicolas Tsapatsoulis;Konstantinos Rapantzikos;Yannis Avrithis
Affiliations:
University of Cyprus, Cyprus;National Technical University of Athens, Greece;National Technical University of Athens, Greece
Venue:
MobiMedia '06 Proceedings of the 2nd international conference on Mobile multimedia communications
Year:
2006

References:
A Theory for Multiresolution Signal Decomposition: The Wavelet Representation. IEEE Transactions on Pattern Analysis and Machine Intelligence
A Model of Saliency-Based Visual Attention for Rapid Scene Analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence
Perceptual quality metrics applied to still image compression. Signal Processing - Special issue on image and video quality metrics
Algorithms for Defining Visual Regions-of-Interest: Comparison with Eye Fixations. IEEE Transactions on Pattern Analysis and Machine Intelligence
Digital Image Processing (3rd Edition)
Facial feature tracking and pose estimation in video sequences by factorial coding of the low-dimensional entropy manifolds due to the partial symmetries of faces. ICASSP '00 Proceedings of the Acoustics, Speech, and Signal Processing, 2000. on IEEE International Conference - Volume 04
Is bottom-up attention useful for object recognition? CVPR'04 Proceedings of the 2004 IEEE computer society conference on Computer vision and pattern recognition
Foveation scalable video coding with automatic fixation selection. IEEE Transactions on Image Processing
Automatic foveation for video compression using a neurobiological model of visual attention. IEEE Transactions on Image Processing

Abstract

In this paper we investigate the use of visual saliency maps for ROI-based video coding in video-telephony applications. Visually salient areas indicated in the saliency map are treated as regions of interest (ROIs). These areas are detected automatically using a visual attention (VA) algorithm that builds on the bottom-up approach proposed by Itti et al. A top-down channel emulating the human visual search for faces has been added, while orientation, intensity and color conspicuity maps are computed within a unified multi-resolution framework based on wavelet subband analysis. For experimentation purposes, priority encoding is applied in a simple manner: frame areas outside the priority regions are blurred using a smoothing filter and then passed to the video encoder. This leads to better compression of both Intra-coded (I) frames (more DCT coefficients are zeroed in the DCT-quantization step) and Inter-coded (P, B) frames (lower prediction error). In more sophisticated approaches, priority encoding could be incorporated by varying the quality factor of the DCT quantization table. Extensive experiments on both static images and low-quality video demonstrate the compression efficiency of the proposed method. Comparisons are made against standard JPEG and MPEG-1 encoding, respectively.
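The blur-then-encode idea for Intra-coded frames can be illustrated with a minimal sketch. It is not the authors' implementation: the box filter stands in for the unspecified smoothing filter, a single uniform quantization step `q` replaces a real JPEG/MPEG quantization table, the ROI mask is supplied by hand instead of coming from a saliency map, and all function names (`box_blur`, `prioritize`, `dct_matrix`, `zeroed_coefficients`) are hypothetical.

```python
import numpy as np


def box_blur(img, k=5):
    """Mean filter with a k x k window (assumed smoothing filter, edge-replicated)."""
    pad = k // 2
    padded = np.pad(img.astype(np.float64), pad, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    # Sum the k*k shifted copies of the padded image, then normalize.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)


def prioritize(img, roi_mask, k=5):
    """Keep ROI pixels untouched; replace all other pixels by their blurred value."""
    return np.where(roi_mask, img.astype(np.float64), box_blur(img, k))


def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix of size n x n."""
    i = np.arange(n)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i[None, :] + 1) * i[:, None] / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c


def zeroed_coefficients(img, q=16.0, n=8):
    """Count block-DCT coefficients that round to zero under a uniform step q.

    The uniform step is an assumption; real codecs use per-frequency tables.
    """
    c = dct_matrix(n)
    h = (img.shape[0] // n) * n
    w = (img.shape[1] // n) * n
    zeros = 0
    for y in range(0, h, n):
        for x in range(0, w, n):
            block = img[y:y + n, x:x + n].astype(np.float64) - 128.0
            coeffs = c @ block @ c.T  # 2-D DCT of the level-shifted block
            zeros += int(np.sum(np.round(coeffs / q) == 0))
    return zeros


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.integers(0, 256, size=(64, 64)).astype(np.float64)  # synthetic frame
    mask = np.zeros(frame.shape, dtype=bool)
    mask[16:48, 16:48] = True  # hand-picked ROI, standing in for a saliency-derived one
    blurred = prioritize(frame, mask)
    print("zeroed coefficients, original:", zeroed_coefficients(frame))
    print("zeroed coefficients, prioritized:", zeroed_coefficients(blurred))
```

On a textured frame, the prioritized version should yield more zeroed coefficients outside the ROI while leaving ROI pixels unchanged, which is the mechanism the abstract credits for the improved compression of I frames.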