Content-based video retrieval requires an effective scene segmentation technique to divide a long video file into meaningful high-level aggregates of shots called scenes. Each scene is part of a story, so browsing the scenes unfolds the entire story of a film. In this paper, we first examine recent scene segmentation techniques that follow the visual-audio alignment approach: the video stream is segmented into visual scenes and the audio stream into audio scenes separately, and the two sets of boundaries are then aligned to produce the final scene boundaries. In contrast, we propose a novel audio-assisted scene segmentation technique that uses audio information to remove false boundaries produced by segmentation on visual information alone. The crux of our technique is a new dissimilarity measure based on the statistical properties of audio features and a concept from information theory. Experimental results on two full-length films with a wide range of camera motion and a complex composition of shots demonstrate the effectiveness of our technique compared with that of the visual-audio alignment techniques.
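The abstract does not specify which information-theoretic quantity underlies the dissimilarity measure, so the following is only an illustrative sketch of the general idea: model the audio feature frames on each side of a candidate visual boundary as multivariate Gaussians, score the boundary by a symmetric Kullback-Leibler divergence between them, and discard boundaries where the audio barely changes. The window size, threshold, and choice of symmetric KL are all assumptions for illustration, not the paper's actual measure.

```python
import numpy as np

def gaussian_kl(mu0, cov0, mu1, cov1):
    """KL divergence KL(N0 || N1) between two multivariate Gaussians."""
    d = mu0.shape[0]
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(cov1_inv @ cov0)
                  + diff @ cov1_inv @ diff
                  - d
                  + np.log(np.linalg.det(cov1) / np.linalg.det(cov0)))

def audio_dissimilarity(left_feats, right_feats):
    """Symmetric KL between Gaussians fitted to the audio feature
    frames (shape (n_frames, n_dims)) on each side of a boundary."""
    mu_l, cov_l = left_feats.mean(0), np.cov(left_feats, rowvar=False)
    mu_r, cov_r = right_feats.mean(0), np.cov(right_feats, rowvar=False)
    # Regularize covariances so inversion is stable on short windows.
    eps = 1e-6 * np.eye(left_feats.shape[1])
    cov_l, cov_r = cov_l + eps, cov_r + eps
    return (gaussian_kl(mu_l, cov_l, mu_r, cov_r)
            + gaussian_kl(mu_r, cov_r, mu_l, cov_l))

def filter_boundaries(visual_boundaries, features, window, threshold):
    """Keep a visually detected boundary only if the audio also
    changes across it; low audio dissimilarity marks it as false."""
    kept = []
    for b in visual_boundaries:
        left = features[max(0, b - window):b]
        right = features[b:b + window]
        if len(left) > 1 and len(right) > 1:
            if audio_dissimilarity(left, right) >= threshold:
                kept.append(b)
    return kept
```

In this sketch the audio assists the visual segmentation rather than being segmented independently and aligned afterwards, which mirrors the contrast the abstract draws with the visual-audio alignment approach.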