In this paper, an unsupervised video object (VO) segmentation and tracking algorithm is proposed, based on an adaptable neural-network architecture. The proposed scheme comprises two modules: 1) a VO tracking module and 2) an initial VO estimation module. Object tracking is handled as a classification problem and implemented through an adaptive network classifier, which provides better results than conventional motion-based tracking algorithms. Network adaptation is accomplished through an efficient and cost-effective weight-updating algorithm that causes minimal degradation of the previously acquired network knowledge while taking current content conditions into account. A retraining set is constructed for this purpose from the initial VO estimation results. Two different scenarios are investigated. The first concerns the extraction of human entities in videoconferencing applications, while the second exploits depth information to identify generic VOs in stereoscopic video sequences. In the first scenario, human face/body detection is performed using Gaussian distributions; in the second, segmentation fusion is obtained by combining color and depth information. A decision mechanism is also incorporated to detect the time instances at which weight updating should occur. Experimental results and comparisons demonstrate the good performance of the proposed scheme, even on sequences with complicated content (object bending, occlusion).
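The weight-updating idea in the abstract, adapting the classifier to a small retraining set while minimally degrading previous network knowledge, can be sketched as a refit regularized toward the old weights. This is an illustrative simplification under an assumed linear-classifier form (the paper uses a neural network; the function name, closed-form solution, and `lam` parameter are assumptions for illustration):

```python
import numpy as np

def retrain_weights(W_old, X_retrain, y_retrain, lam=0.1):
    """Illustrative sketch: update weights on a small retraining set while
    penalizing deviation from the previous weights, so prior knowledge
    degrades as little as possible.

    Solves  min_W ||X W - y||^2 + lam * ||W - W_old||^2  in closed form;
    larger lam keeps the new weights closer to the old ones.
    """
    X = np.asarray(X_retrain, dtype=float)
    y = np.asarray(y_retrain, dtype=float)
    d = X.shape[1]
    # Ridge-style normal equations, biased toward the old weights W_old.
    A = X.T @ X + lam * np.eye(d)
    b = X.T @ y + lam * W_old
    return np.linalg.solve(A, b)
```

With a large `lam` the solution stays near `W_old` (knowledge preserved); with a small `lam` it fits the retraining set (adaptation to current content), which is the trade-off the weight-updating algorithm balances.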
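The Gaussian-distribution face/body detection mentioned for the videoconferencing scenario can be illustrated as per-pixel likelihood thresholding under a Gaussian color model. The chrominance color space, the model parameters, and the threshold below are assumptions for the sketch, not the paper's trained values:

```python
import numpy as np

def gaussian_skin_mask(pixels_cbcr, mean, cov, threshold):
    """Illustrative sketch: mark pixels whose chrominance fits a Gaussian
    skin model, by thresholding the squared Mahalanobis distance to the
    model mean. All parameter values are assumed, not the paper's.
    """
    diff = np.asarray(pixels_cbcr, dtype=float) - mean  # (N, 2) deviations
    inv = np.linalg.inv(cov)
    # Squared Mahalanobis distance of each pixel to the skin model.
    d2 = np.einsum('ni,ij,nj->n', diff, inv, diff)
    return d2 < threshold  # True where the pixel fits the Gaussian model
```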
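The decision mechanism that detects time instances for weight updating can be pictured as a simple change test: trigger retraining when the tracker's error on the current frame grows markedly over its recent level, signalling changed scene content. The rule and its `rel_increase` parameter below are assumptions for illustration, not the paper's actual criterion:

```python
def needs_retraining(prev_error, curr_error, rel_increase=0.5):
    """Illustrative decision rule (an assumption, not the paper's exact
    test): flag a weight-updating time instant when the current tracking
    error exceeds the previous level by more than rel_increase (50%)."""
    return curr_error > (1.0 + rel_increase) * prev_error
```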