Video scene retrieval with sign sequence matching based on audio features

  • Authors:
  • Keisuke Morisawa;Naoko Nitta;Noboru Babaguchi

  • Affiliations:
  • Graduate School of Engineering, Osaka University, Japan;Graduate School of Engineering, Osaka University, Japan;Graduate School of Engineering, Osaka University, Japan

  • Venue:
  • PCM'04 Proceedings of the 5th Pacific Rim Conference on Advances in Multimedia Information Processing - Volume Part II
  • Year:
  • 2004

Abstract

In this paper, we propose a method for quickly retrieving scenes that are semantically similar to a query video segment from large-scale videos using audio features. The method first classifies the sound of the target and query videos into voices and background sounds and extracts feature vectors by focusing on the sound sources. The feature vectors are then clustered by the K-means algorithm, and the cluster ID, which we call a sign, is assigned to each feature vector in the corresponding cluster, so that a video segment is represented as a sign sequence. Finally, video scenes are retrieved by sign sequence matching using Dynamic Programming. The experimental results show that this method is potentially useful for scene retrieval.
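The clustering and matching steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the audio features, the number of clusters, or the exact Dynamic Programming formulation, so the sketch assumes that feature vectors are already extracted, uses a plain Lloyd-style K-means, and uses edit distance as one common DP-based sequence matching measure. The function names (`kmeans`, `to_signs`, `edit_distance`) and parameters are hypothetical.

```python
import random


def nearest(vector, centroids):
    """Index of the centroid closest to `vector` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(vector, centroids[i])),
    )


def kmeans(vectors, k, iters=20, seed=0):
    """Plain Lloyd-style K-means; returns a list of k centroids.

    Assumption: the initial centroids are k distinct input vectors
    picked at random. The abstract does not describe the initialization.
    """
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(v, centroids)].append(v)
        for i, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


def to_signs(vectors, centroids):
    """Map each feature vector to its cluster ID (its 'sign')."""
    return [nearest(v, centroids) for v in vectors]


def edit_distance(a, b):
    """DP edit distance between two sign sequences.

    Assumption: unit cost for insertion, deletion, and substitution.
    The abstract only states that Dynamic Programming is used.
    """
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # match or substitute
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    # Toy 2-D "feature vectors" standing in for real audio features.
    target = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
    query = [[0.1, 0.2], [9.8, 10.4]]
    centroids = kmeans(target, k=2)
    print(edit_distance(to_signs(query, centroids), to_signs(target, centroids)))
```

In such a scheme, a lower distance between the query's sign sequence and a candidate segment's sign sequence would indicate a more similar scene. Comparing short sequences of integer signs rather than raw feature vectors is what would make the matching step cheap.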