Place retrieval with graph-based place-view model

  • Authors:
  • Xiaoshuai Sun, Rongrong Ji, Hongxun Yao, Pengfei Xu, Tianqiang Liu, Xianming Liu

  • Affiliations:
  • Harbin Institute of Technology, Harbin, China (all authors)

  • Venue:
  • MIR '08 Proceedings of the 1st ACM international conference on Multimedia information retrieval
  • Year:
  • 2008


Abstract

Places in movies and sitcoms can indicate higher-level semantic cues about story scenarios and actor relations. This paper presents a novel unsupervised framework for efficient place retrieval in movies and sitcoms. We leverage face detection to filter out close-up frames from the video dataset, and adopt saliency-map analysis to separate background places from foreground actions. We then extract a pyramid-based spatial-encoding correlogram from shot key frames for robust place representation. To effectively describe varying place appearances, we cluster key frames and model the inter-cluster association of key frames belonging to the same physical place via inside-shot association. Hierarchical normalized cut is then applied over the association graph to differentiate physical places within videos and to obtain their multi-view representation as a tree structure. For efficient place matching in a large-scale database, inverted indexing is applied to the hierarchical graph structure, on top of which approximate nearest-neighbor search is proposed to greatly accelerate retrieval. Experimental results on an over-36-hour Friends sitcom database demonstrate the effectiveness, efficiency, and semantic-revealing ability of our framework.
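The retrieval step described above, an inverted index over a cluster tree combined with approximate nearest-neighbor search, can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `PlaceTree`, the flat k-means coarse level, the descriptor dimensionality, and the L2 distance are all illustrative assumptions standing in for the paper's hierarchical graph structure and correlogram features.

```python
import math
import random

def l2(a, b):
    # Euclidean distance between two descriptor vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class PlaceTree:
    """Toy stand-in for the paper's hierarchical place structure:
    cluster key-frame descriptors into coarse nodes, and keep an
    inverted list of the key frames assigned to each node."""

    def __init__(self, descriptors, n_nodes=4, iters=10, seed=0):
        rng = random.Random(seed)
        # initialize coarse-level centroids from the data
        self.centroids = list(rng.sample(descriptors, n_nodes))
        for _ in range(iters):  # plain k-means for the coarse level
            buckets = [[] for _ in self.centroids]
            for d in descriptors:
                buckets[self._nearest(d)].append(d)
            for i, bucket in enumerate(buckets):
                if bucket:
                    dim = len(bucket[0])
                    self.centroids[i] = tuple(
                        sum(v[k] for v in bucket) / len(bucket)
                        for k in range(dim)
                    )
        # inverted index: node id -> key frames filed under that node
        self.index = {i: [] for i in range(n_nodes)}
        for d in descriptors:
            self.index[self._nearest(d)].append(d)

    def _nearest(self, d):
        return min(range(len(self.centroids)),
                   key=lambda i: l2(d, self.centroids[i]))

    def query(self, q):
        """Approximate NN search: instead of scanning the whole
        database, scan only the inverted list of the closest node."""
        bucket = self.index[self._nearest(q)]
        return min(bucket, key=lambda d: l2(q, d)) if bucket else None

# Usage with two well-separated synthetic "places":
frames = [(0.1, 0.1), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]
tree = PlaceTree(frames, n_nodes=2)
match = tree.query((5.0, 5.1))  # matched against one node's list only
```

The speed-up comes from the same source as in the paper: a query is compared against the contents of a single tree node rather than every key frame in the database.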