Video scene retrieval with sign sequence matching based on audio features

Authors:
Keisuke Morisawa;Naoko Nitta;Noboru Babaguchi
Affiliations:
Graduate School of Engineering, Osaka University, Japan;Graduate School of Engineering, Osaka University, Japan;Graduate School of Engineering, Osaka University, Japan
Venue:
PCM'04 Proceedings of the 5th Pacific Rim Conference on Advances in Multimedia Information Processing - Volume Part II
Year:
2004

Abstract

In this paper, we propose a method for quickly retrieving scenes that are semantically similar to a query video segment from large-scale video collections using audio features. The method first classifies the sound of the target and query videos into voices and background sounds, and extracts feature vectors according to the sound source. The feature vectors are then clustered with the K-means algorithm, and each feature vector is assigned the ID of its cluster, which we call a sign, so that a video segment is represented as a sign sequence. Finally, video scenes are retrieved by matching sign sequences using Dynamic Programming. Experimental results suggest that the method is potentially useful for scene retrieval.
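
The clustering and matching stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the audio features, the number of clusters, or the Dynamic Programming cost, so the toy 2-D feature vectors, the deterministic K-means initialisation, the use of edit distance as the matching cost, and all function names (`kmeans`, `to_signs`, `dp_distance`, `retrieve`) are assumptions made for illustration. The voice/background classification step is omitted.

```python
from math import dist


def nearest(vector, centroids):
    """Index of the centroid closest to the vector (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: dist(vector, centroids[i]))


def kmeans(vectors, k, iters=20):
    """Plain Lloyd's K-means. Assumption: the first k vectors seed the centroids."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest(v, centroids)].append(v)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def to_signs(vectors, centroids):
    """Replace each feature vector by its cluster ID (its 'sign')."""
    return [nearest(v, centroids) for v in vectors]


def dp_distance(a, b):
    """Edit distance between two sign sequences, computed by dynamic programming.
    Assumption: unit costs for insertion, deletion, and substitution."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # match or substitute
        prev = cur
    return prev[-1]


def retrieve(query, target, window):
    """Slide a fixed-length window over the target sign sequence and
    return (start index, cost) of the best-matching window."""
    costs = [dp_distance(query, target[s:s + window])
             for s in range(len(target) - window + 1)]
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs[best]


# Toy data: hypothetical 2-D feature vectors standing in for audio features.
target_vecs = [(0.0, 0.0), (5.0, 5.0), (0.1, 0.0),
               (5.1, 5.0), (5.0, 5.1), (0.0, 0.1)]
query_vecs = [(5.0, 4.9), (5.2, 5.0), (0.1, 0.1)]

centroids = kmeans(target_vecs, k=2)
target_signs = to_signs(target_vecs, centroids)
query_signs = to_signs(query_vecs, centroids)
best_start, best_cost = retrieve(query_signs, target_signs, window=len(query_signs))
```

Because matching operates on short integer sequences rather than on the raw feature vectors, each comparison is cheap, which is consistent with the abstract's emphasis on fast retrieval from large video collections.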