Interaction between high-level and low-level image analysis for semantic video object extraction

Authors:
Andrea Cavallaro; Touradj Ebrahimi
Affiliations:
Multimedia and Vision Laboratory, Queen Mary University of London (QMUL), London, UK; Signal Processing Institute, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
Venue:
EURASIP Journal on Applied Signal Processing
Year:
2004

Abstract

The task of extracting a semantic video object is split into two subproblems: object segmentation and region segmentation. Object segmentation relies on a priori assumptions, whereas region segmentation is data-driven and can be solved automatically. The two subproblems are not independent, and each can benefit from interaction with the other. This paper formulates a framework for such interaction. The representation scheme, based on region segmentation and semantic segmentation, is compatible with the view that image analysis and scene understanding problems can be decomposed into low-level and high-level tasks. Low-level tasks pertain to region-oriented processing, whereas high-level tasks are closely related to object-level processing. This approach emulates the human visual system: what one "sees" in a scene depends on the scene itself (region segmentation) as well as on the cognitive task at hand (semantic segmentation). The higher-level segmentation results in a partition corresponding to semantic video objects. Semantic video objects do not usually have invariant physical properties, and their definition depends on the application. Hence, the definition incorporates complex domain-specific knowledge and is not easy to generalize. In the specific implementation used in this paper, motion serves as a cue to semantic information. Within this framework, an automatic algorithm is presented for computing the semantic partition based on color change detection. The change detection strategy is designed to be immune to sensor noise and local illumination variations. The lower-level segmentation identifies the partition corresponding to perceptually uniform regions. These regions are derived by clustering in an N-dimensional feature space composed of static as well as dynamic image attributes.
We propose an interaction mechanism between the semantic and the region partitions that makes it possible to handle multiple simultaneous objects. Experimental results show that the proposed method extracts semantic video objects with high spatial accuracy and temporal coherence.
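
The interplay between the two partitions can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes a fixed per-pixel threshold in place of the noise-robust color change detector, takes the region labels as given rather than deriving them by clustering, and uses a simple majority vote as a hypothetical interaction rule. The names `change_mask`, `refine_with_regions`, `threshold`, and `min_fraction` are illustrative only.

```python
from collections import defaultdict


def change_mask(frame, background, threshold):
    """Semantic cue (assumed form): a pixel is 'changed' when any color
    channel differs from the background by more than `threshold`."""
    return [
        [max(abs(a - b) for a, b in zip(pixel, ref)) > threshold
         for pixel, ref in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def refine_with_regions(mask, regions, min_fraction=0.5):
    """Hypothetical interaction rule: a region joins the object partition
    when at least `min_fraction` of its pixels are marked as changed."""
    changed = defaultdict(int)
    total = defaultdict(int)
    for mask_row, region_row in zip(mask, regions):
        for is_changed, label in zip(mask_row, region_row):
            total[label] += 1
            changed[label] += int(is_changed)
    object_labels = {
        label for label in total
        if changed[label] / total[label] >= min_fraction
    }
    return [[label in object_labels for label in row] for row in regions]


# Toy 2x4 frame: a red object on a dark background. One object pixel
# (row 1, column 2) has nearly the background color, so the change
# detector alone misses it.
bg = (10, 10, 10)
red = (200, 10, 10)
background = [[bg] * 4 for _ in range(2)]
frame = [
    [bg, red, red, (12, 10, 10)],
    [bg, red, (11, 10, 10), bg],
]
# Region labels, assumed to come from a separate low-level segmentation.
regions = [
    [0, 1, 1, 2],
    [0, 1, 1, 2],
]

mask = change_mask(frame, background, threshold=20)
refined = refine_with_regions(mask, regions)
print(mask)
print(refined)
```

In this toy case the region partition recovers the object pixel that the change detector misses, because three of the four pixels in region 1 are marked as changed. The vote also works in the other direction: isolated false detections inside a mostly unchanged region would be discarded.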