Attention-based video streaming

Authors:
Çağatay Dikici;H. Işıl Bozma
Affiliations:
Intelligent Systems Laboratory, Electric Electronic Engineering Department, Boğaziçi University, Istanbul, Turkey;Intelligent Systems Laboratory, Electric Electronic Engineering Department, Boğaziçi University, Istanbul, Turkey
Venue:
Image Communication
Year:
2010

Abstract

This paper considers the problem of video streaming over low-bandwidth networks and presents a complete framework inspired by the fovea-periphery distinction of biological vision systems. First, an application-specific attention function, which finds the small, important regions in a given frame, is constructed a priori using a back-propagation neural network that is optimized combinatorially. For a given application, the attention function partitions each frame into foveal and peripheral regions. A spatio-temporal pre-processing algorithm then encodes the foveal regions at high spatial resolution and the peripheral regions at lower spatial and temporal resolution. Finally, the pre-processed video sequence is streamed using a standard streaming server. As an application, we consider the transmission of human face videos. Our experimental results indicate that, even with a limited amount of training, the constructed attention function is able to determine the foveal regions, which are transmitted with improved quality, while the peripheral regions undergo acceptable degradation.
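
The pre-processing step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes grayscale frames stored as 2D NumPy arrays and takes the foveal mask as an input, standing in for the output of the trained attention function. The names `block_average` and `preprocess` and the parameters `factor` and `period` are hypothetical. Spatial resolution in the periphery is lowered by block averaging, and temporal resolution is lowered by refreshing the periphery only every `period` frames. Both are assumed simplifications of whatever encoding the paper actually uses.

```python
import numpy as np


def block_average(frame, factor):
    """Lower spatial resolution: average factor x factor blocks, then
    repeat each block value so the output keeps the input shape."""
    h, w = frame.shape
    if h % factor or w % factor:
        raise ValueError("frame dimensions must be multiples of factor")
    coarse = frame.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
    return np.repeat(np.repeat(coarse, factor, axis=0), factor, axis=1)


def preprocess(frames, masks, factor=4, period=3):
    """Keep foveal pixels at full resolution in every frame.

    frames: sequence of 2D float arrays of identical shape.
    masks:  sequence of boolean arrays (True = foveal); assumed to come
            from an attention function that is not modeled here.
    The periphery is spatially coarsened and refreshed only every
    `period` frames, which lowers its temporal resolution.
    """
    out = []
    periphery = None
    for t, (frame, mask) in enumerate(zip(frames, masks)):
        if t % period == 0:
            periphery = block_average(frame, factor)  # periodic refresh
        out.append(np.where(mask, frame, periphery))
    return out


# Usage: four random 8x8 frames with a fixed 4x4 foveal window.
rng = np.random.default_rng(0)
frames = [rng.random((8, 8)) for _ in range(4)]
mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 2:6] = True
result = preprocess(frames, [mask] * 4, factor=4, period=3)
```

In this sketch the output frames keep the original dimensions, so they could be passed unchanged to a standard encoder and streaming server. The bandwidth saving would then come from the encoder compressing the smooth, slowly changing periphery more cheaply, which is an assumption about the downstream codec and not a claim from the abstract.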