Optimal content-based video decomposition for interactive video navigation

Authors:
A. D. Doulamis;N. D. Doulamis
Affiliations:
Dept. of Electr. & Comput. Eng., Nat. Tech. Univ. of Athens, Greece;-
Venue:
IEEE Transactions on Circuits and Systems for Video Technology
Year:
2004

Abstract

In this paper, an interactive framework for navigating video sequences is presented, based on an optimal content-based video decomposition scheme. In particular, each video sequence is analyzed at different content resolution levels, creating a hierarchy from the lowest (coarse) to the highest (fine) resolution. This content hierarchy is represented as a tree structure, each level of which corresponds to a particular content resolution, while the tree nodes indicate the temporal video segments into which the sequence content is partitioned at a given resolution. A criterion is introduced to measure the efficiency of the proposed scheme in organizing the visual content of the video and to compare it with other hierarchical video content representations and navigation schemes. The efficiency is measured as the difficulty for a user to locate a video segment of interest while moving through the different levels of the hierarchy. In our case, the video is decomposed so that the best efficiency is achieved. The efficiency of a nonlinear video decomposition scheme depends on: 1) the number of paths required for a user to locate a relevant video segment and 2) the number of shot/frame classes (i.e., content representatives) extracted to represent the visual content. Both issues are addressed in this paper. For the first, the probability of selecting a relevant video segment on the first path is maximized by extracting optimal content representatives through the minimization of a cross-correlation criterion. A genetic algorithm (GA) is adopted for the minimization, since the computational cost of an exhaustive search for the minimum value is too large for it to be implemented. The cross-correlation criterion is evaluated in the feature domain by extracting appropriate global and object-based descriptors for each video frame, so that a better representation of the visual content is achieved.
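The representative-selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the cost function (mean squared Pearson correlation between the feature vectors of the selected frames), the crossover and mutation operators, and all names and parameter values (`select_representatives`, `pop_size`, `mutation_rate`, and so on) are assumptions chosen for illustration only.

```python
import random
from itertools import combinations


def correlation(u, v):
    """Pearson correlation coefficient of two equal-length feature vectors."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv) if su and sv else 0.0


def cost(indices, features):
    """Assumed criterion: mean squared correlation over all selected pairs."""
    pairs = list(combinations(indices, 2))
    return sum(correlation(features[i], features[j]) ** 2 for i, j in pairs) / len(pairs)


def select_representatives(features, k, pop_size=30, generations=60,
                           mutation_rate=0.2, seed=0):
    """Pick k (>= 2) frame indices with low mutual correlation using a simple GA."""
    rng = random.Random(seed)
    n = len(features)
    # Each chromosome is a sorted tuple of k distinct frame indices.
    population = [tuple(sorted(rng.sample(range(n), k))) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda c: cost(c, features))
        parents = population[: pop_size // 2]  # elitism: keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: draw k distinct indices from the union of both parents.
            child = set(rng.sample(sorted(set(a) | set(b)), k))
            if rng.random() < mutation_rate:
                # Mutation: swap one selected index for one not currently selected.
                child.discard(rng.choice(sorted(child)))
                child.add(rng.choice([i for i in range(n) if i not in child]))
            children.append(tuple(sorted(child)))
        population = parents + children
    return min(population, key=lambda c: cost(c, features))
```

Because the better half of each generation is carried over unchanged, the best cost never increases from one generation to the next. An exhaustive search would have to evaluate every k-subset of the frames, which is the cost the GA is meant to avoid.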
The second aspect (i.e., the number of content representatives) is addressed by minimizing the average transmitted information while simultaneously taking into consideration the complexity of each temporal video segment. More content representatives are extracted for video segments of high complexity, whereas fewer are required for low-complexity segments. In addition, a degree of interest is assigned to each video shot (or frame) to capture the extent to which, from the user's point of view, the visual content of a set of shots (frames) satisfies his/her information needs. Finally, a computationally efficient algorithm is proposed to regulate the degree of detail (i.e., the number of shot/frame representatives) in case the visual content is not efficiently represented from the user's point of view. Experimental results on real-life video sequences indicate the performance of the proposed GA-based video decomposition scheme compared to other hierarchical video organization methods.
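The idea that high-complexity segments receive more representatives can be sketched as follows. The abstract does not give the actual criterion (it refers only to minimizing the average transmitted information), so this example substitutes a plain proportional split with largest-remainder rounding. The function name `allocate_representatives`, the `minimum` parameter, and the complexity scores are hypothetical.

```python
def allocate_representatives(complexities, total, minimum=1):
    """Split `total` representatives across segments in proportion to complexity.

    Assumed stand-in for the paper's criterion: every segment gets `minimum`
    representatives, and the rest are shared proportionally to the scores.
    """
    n = len(complexities)
    if total < minimum * n:
        raise ValueError("budget too small for the per-segment minimum")
    spare = total - minimum * n
    weight = sum(complexities)
    if weight == 0:
        shares = [spare / n] * n  # no complexity information: split evenly
    else:
        shares = [spare * c / weight for c in complexities]
    counts = [minimum + int(s) for s in shares]
    # Give the units lost to rounding down to the largest fractional parts.
    leftover = max(0, total - sum(counts))
    order = sorted(range(n), key=lambda i: shares[i] - int(shares[i]), reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts
```

For example, with hypothetical complexity scores `[1, 3, 6]` and a budget of 13 representatives, each segment first receives one representative and the remaining ten are split 1:3:6.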