Bag of visual words model for videos segmentation into scenes

  • Authors:
  • Junaid Baber;Shin'ichi Satoh;Nitin Afzulpurkar;Chadaporn Keatmanee

  • Affiliations:
  • Asian Institute of Technology, Thailand;National Institute of Informatics, Japan;Asian Institute of Technology, Thailand;Sripatum University, Thailand

  • Venue:
  • Proceedings of the Fifth International Conference on Internet Multimedia Computing and Service
  • Year:
  • 2013

Abstract

With the advancement of multimedia technologies, video databases are growing exponentially in size, which creates many challenges for efficient indexing and retrieval of videos. Interactive and efficient search engines allow users to query part of a video or search for particular scenes in video databases. Segmentation and retrieval of video scenes are gaining popularity because they make video indexing more flexible and efficient. In this paper, we automatically segment videos into scenes (groups of visually related video shots). We first segment the videos into shots and then merge shots that are visually similar. We represent shots with the bag-of-visual-words (BoVW) model and compute the similarity of shots within a sliding window of length L; the sliding window makes similarity computation efficient because each shot is compared only with its L neighbors instead of the whole pool of shots. Experiments on cinematic videos and dramas show the effectiveness of the proposed technique.
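The windowed merging step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes each shot has already been reduced to a BoVW histogram (a fixed-length vector of visual-word counts), uses cosine similarity as the shot-similarity measure, and greedily appends a shot to the current scene if it is similar enough to any of the last L shots in that scene; the threshold value and the exact neighbor comparison scheme are assumptions.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two BoVW histograms (0.0 if either is empty)."""
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    if na == 0 or nb == 0:
        return 0.0
    return float(np.dot(a, b) / (na * nb))

def segment_scenes(shot_histograms, L=3, threshold=0.5):
    """Greedily group consecutive shots into scenes.

    A shot joins the current scene if it is similar enough to any of the
    previous L shots in that scene; otherwise it starts a new scene.
    Comparing only against L neighbors keeps the cost linear in the
    number of shots instead of quadratic.
    """
    scenes = [[0]]  # list of scenes, each a list of shot indices
    for i in range(1, len(shot_histograms)):
        recent = scenes[-1][-L:]  # the sliding window of neighbors
        if any(cosine_sim(shot_histograms[i], shot_histograms[j]) >= threshold
               for j in recent):
            scenes[-1].append(i)
        else:
            scenes.append([i])
    return scenes

# Toy example: 5 shots over a 3-word vocabulary; the first three shots share
# visual words, the last two share different ones.
hists = np.array([[10, 0, 0],
                  [9, 1, 0],
                  [8, 2, 0],
                  [0, 0, 10],
                  [1, 0, 9]], dtype=float)
print(segment_scenes(hists, L=3, threshold=0.5))  # → [[0, 1, 2], [3, 4]]
```

In a real pipeline the histograms would come from quantizing local descriptors (e.g. SIFT) of keyframes against a learned visual vocabulary; the greedy merge above is only one simple way to exploit the sliding window.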