Content-based video retrieval requires an effective scene segmentation technique to divide a long video file into meaningful high-level aggregates of shots called scenes. Each scene is part of a story, so browsing the scenes unfolds the entire story of a film. In this paper, we first examine recent scene segmentation techniques that follow the visual-audio alignment approach: the video stream is segmented into visual scenes and the audio stream into audio scenes separately, and the two sets of boundaries are then aligned to produce the final scene boundaries. In contrast, we propose a novel audio-assisted scene segmentation technique that uses audio information to remove false boundaries produced by segmentation on visual information alone. The crux of our technique is a new dissimilarity measure based on the statistical properties of audio features and a concept from information theory. Experimental results on two full-length films with a wide range of camera motion and a complex composition of shots demonstrate the effectiveness of our technique compared with that of the visual-audio alignment techniques.
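The abstract does not specify which information-theoretic quantity underlies the dissimilarity measure, so the following is only an illustrative sketch of the general idea: model the audio feature frames on each side of a candidate visual boundary as multivariate Gaussians, score the boundary by a symmetric Kullback-Leibler divergence between them, and discard boundaries where the audio barely changes. The window size, threshold, and choice of symmetric KL are all assumptions for illustration, not the paper's actual measure.

```python
import numpy as np

def gaussian_kl(mu0, cov0, mu1, cov1):
    """KL divergence KL(N0 || N1) between two multivariate Gaussians."""
    d = mu0.shape[0]
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(cov1_inv @ cov0)
                  + diff @ cov1_inv @ diff
                  - d
                  + np.log(np.linalg.det(cov1) / np.linalg.det(cov0)))

def audio_dissimilarity(left_feats, right_feats):
    """Symmetric KL between Gaussians fitted to the audio feature
    frames (shape (n_frames, n_dims)) on each side of a boundary."""
    mu_l, cov_l = left_feats.mean(0), np.cov(left_feats, rowvar=False)
    mu_r, cov_r = right_feats.mean(0), np.cov(right_feats, rowvar=False)
    # Regularize covariances so inversion is stable on short windows.
    eps = 1e-6 * np.eye(left_feats.shape[1])
    cov_l, cov_r = cov_l + eps, cov_r + eps
    return (gaussian_kl(mu_l, cov_l, mu_r, cov_r)
            + gaussian_kl(mu_r, cov_r, mu_l, cov_l))

def filter_boundaries(visual_boundaries, features, window, threshold):
    """Keep a visually detected boundary only if the audio also
    changes across it; low audio dissimilarity marks it as false."""
    kept = []
    for b in visual_boundaries:
        left = features[max(0, b - window):b]
        right = features[b:b + window]
        if len(left) > 1 and len(right) > 1:
            if audio_dissimilarity(left, right) >= threshold:
                kept.append(b)
    return kept
```

In this sketch the audio assists the visual segmentation rather than being segmented independently and aligned afterwards, which mirrors the contrast the abstract draws with the visual-audio alignment approach.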