More and more cameras are installed every day for safety, security and intelligence-gathering purposes, steadily increasing the volume of stored video. It is therefore important to manage this resource so that a structured (hierarchical) view of the activities in long video files can be provided, cataloguing only interesting or domain-relevant events. This paper addresses the issue by proposing a novel and efficient computational approach to the semantic segmentation of scene activities exhibited in monocular or multi-view surveillance videos. The key is to derive a so-called ‘pace’ descriptor, reflecting the change in the underlying scene activities of a surveillance site, by detecting key scene frames and modelling their temporal distribution. The former is performed by extracting 2D or 3D appearance-based subspace embedding features, followed by time-constrained agglomerative clustering. The latter models the density of the key-frame distribution in the time domain and then applies a visual curve segmentation algorithm to identify scene segments of different activities. The approach is especially suited to crowd scene segmentation, and it has been evaluated on real-world surveillance videos of both underground platforms and a busy industrial park entrance at rush hour, with promising results.
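The two stages described above (key-frame detection via time-constrained agglomerative clustering, then segmentation of the key-frame density curve) can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the frame features are assumed to be precomputed subspace embeddings, the merge criterion is a simple centroid distance with a hypothetical `max_dist` threshold, and the "curve segmentation" step is reduced to thresholding jumps in a sliding-window key-frame count.

```python
import numpy as np

def time_constrained_clustering(features, max_dist):
    """Agglomeratively merge only temporally adjacent frame clusters.

    `features` is an (n_frames, d) array of per-frame appearance
    embeddings (assumed already projected into a 2D/3D subspace).
    Returns one representative key-frame index per resulting cluster.
    """
    # Each cluster starts as a single frame: (start, end, centroid).
    clusters = [(i, i, features[i].astype(float)) for i in range(len(features))]
    while len(clusters) > 1:
        # Time constraint: only neighbouring clusters may merge.
        dists = [np.linalg.norm(clusters[i][2] - clusters[i + 1][2])
                 for i in range(len(clusters) - 1)]
        j = int(np.argmin(dists))
        if dists[j] > max_dist:          # stop when nearest pair is too far apart
            break
        s0, e0, c0 = clusters[j]
        s1, e1, c1 = clusters[j + 1]
        n0, n1 = e0 - s0 + 1, e1 - s1 + 1
        clusters[j:j + 2] = [(s0, e1, (n0 * c0 + n1 * c1) / (n0 + n1))]
    # Representative key frame: temporal midpoint of each cluster.
    return [(s + e) // 2 for s, e, _ in clusters]

def pace_segments(key_frames, n_frames, window, jump):
    """Compute a 'pace' curve (key frames per sliding window) and place a
    segment boundary wherever the pace changes by at least `jump`."""
    marks = np.zeros(n_frames)
    marks[key_frames] = 1.0
    pace = np.convolve(marks, np.ones(window), mode="same")
    boundaries = [t for t in range(1, n_frames)
                  if abs(pace[t] - pace[t - 1]) >= jump]
    return pace, boundaries
```

On synthetic features with a static stretch followed by a rapidly changing one, the static stretch collapses to a single key frame while the busy stretch keeps many, so the pace curve rises sharply at the activity change and a boundary is placed there.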