Non-sequential multiscale content-based video decomposition

  • Authors:
  • Nikolaos Doulamis; Anastasios Doulamis

  • Affiliations:
  • Electrical and Computer Engineering Department, Computer Science Division, National Technical University of Athens, 11.23 office, 9, Heroon Polytechniou Street, Zografou 15773, Athens, Greece

  • Venue:
  • Signal Processing - Special section on content-based image and video retrieval
  • Year:
  • 2005

Abstract

In this paper, a multiscale content-based video decomposition scheme is presented for efficient non-linear (non-sequential) organization of the visual content of video. In particular, each video file is decomposed into a multiscale structure of different "content resolution levels", creating a hierarchy from the lowest (coarse) to the highest (fine) resolution. The scheme resembles the progressive transmission of still images: instead of transmitting the image sequentially at full resolution, by scanning it line by line, a lower image resolution is delivered first and the image quality is then gradually enhanced, so that the user is able to see a preview of the image content at any time. The proposed video decomposition is represented as a graph structure, each level of which corresponds to a particular content resolution, while the graph nodes represent the respective regions into which the content is analyzed at that level. Transitions among nodes of the same level are also permitted. The number of nodes at a given level expresses the degree of detail with which the content is analyzed at that level. This number is estimated by minimizing the average amount of transmitted information required to localize a video segment of interest, while also taking into account the content complexity. Quality criteria are introduced to evaluate the efficiency of the proposed scheme. The efficiency of the organization is maximized if multiscale content decomposition is performed using content representatives and constructing content classes. In our approach, content representatives are estimated as those of maximum dissimilarity, as expressed by a distance metric. The optimization is conducted by incorporating a stochastic algorithm with a logarithmically reduced search area (stochastic logarithmic).
Experimental results on real-life video sequences show that the proposed multiscale video organization enables users to detect content of interest much faster, compared to the conventional sequential video scanning or other video decomposition/summarization methods, resulting in a better organization efficiency as measured by the quality criteria.
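
The idea of content representatives and content classes organized over several resolution levels can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not give the feature extraction, the distance metric, or the details of the stochastic logarithmic search, so the sketch assumes each frame is already described by a numeric feature vector, uses Euclidean distance, and substitutes a simple greedy farthest-point selection for the paper's optimization. The number of nodes per level (`k`) is passed in by hand rather than estimated from the transmitted-information criterion. All function names are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two feature vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pick_representatives(features, k):
    """Greedily pick up to k mutually dissimilar items; returns their indices.

    Greedy farthest-point selection, used here as a stand-in for the
    paper's stochastic logarithmic search.
    """
    if not features:
        return []
    chosen = [0]
    # Distance from every item to its nearest chosen representative.
    nearest = [distance(f, features[0]) for f in features]
    while len(chosen) < min(k, len(features)):
        idx = max(range(len(features)), key=nearest.__getitem__)
        if nearest[idx] == 0:
            break  # every remaining item duplicates a chosen representative
        chosen.append(idx)
        for i, f in enumerate(features):
            nearest[i] = min(nearest[i], distance(f, features[idx]))
    return chosen


def build_classes(features, reps):
    """Assign each item to its closest representative (a content class)."""
    classes = {r: [] for r in reps}
    for i, f in enumerate(features):
        best = min(reps, key=lambda r: distance(f, features[r]))
        classes[best].append(i)
    return classes


def decompose(features, indices, k, depth):
    """Build a coarse-to-fine hierarchy of nodes over the given frame indices."""
    subset = [features[i] for i in indices]
    reps = pick_representatives(subset, k)
    nodes = []
    for rep, members in build_classes(subset, reps).items():
        member_ids = [indices[m] for m in members]
        children = []
        if depth > 1 and len(member_ids) > 1:
            # Refine this class at the next (finer) content resolution level.
            children = decompose(features, member_ids, k, depth - 1)
        nodes.append({
            "representative": indices[rep],
            "members": member_ids,
            "children": children,
        })
    return nodes


# Toy feature vectors standing in for per-frame descriptors.
frames = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0), (21, 0)]
tree = decompose(frames, list(range(len(frames))), k=3, depth=2)
```

Each top-level node in `tree` corresponds to a coarse content class with one representative frame, and its `children` refine that class at the next level, so a user could descend only into the branch that looks relevant instead of scanning the frames sequentially.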