Place retrieval with graph-based place-view model

Authors:
Xiaoshuai Sun;Rongrong Ji;Hongxun Yao;Pengfei Xu;Tianqiang Liu;Xianming Liu
Affiliations:
Harbin Institute of Technology, Harbin, China (all authors)
Venue:
MIR '08 Proceedings of the 1st ACM international conference on Multimedia information retrieval
Year:
2008

Abstract

Places in movies and sitcoms carry higher-level semantic cues about story scenarios and relations between actors. This paper presents a novel unsupervised framework for efficient place retrieval in movies and sitcoms. We use face detection to filter close-up frames out of the video dataset, and saliency map analysis to separate background places from foreground actions. We then extract a pyramid-based, spatially encoded correlogram from the key frames of each shot as a robust place representation. To describe the varying appearances of a place, we cluster key frames and model which clusters belong to the same place through their association within shots. A hierarchical normalized cut over the resulting association graph then differentiates the physical places in the videos and yields a multi-view representation of each place as a tree structure. For efficient place matching in a large-scale database, inverted indexing is applied to the hierarchical graph structure, and an approximate nearest neighbor search built on this index substantially accelerates the search process. Experimental results on a database of over 36 hours of the Friends sitcom demonstrate the framework's effectiveness, efficiency, and ability to reveal semantics.
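
The hierarchical normalized cut step can be illustrated with a minimal sketch. It assumes a symmetric, non-negative association matrix W between key-frame clusters and uses the standard spectral relaxation of the two-way normalized cut, applied recursively. The function names, the sign-based split of the eigenvector, and the stopping threshold max_ncut are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def ncut_bipartition(W):
    """Two-way normalized cut via the spectral relaxation (sign threshold is an assumption)."""
    d = W.sum(axis=1) + 1e-12                      # node degrees; epsilon avoids division by zero
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)             # eigenvalues in ascending order
    fiedler = d_inv_sqrt * eigvecs[:, 1]           # second-smallest eigenvector, rescaled
    return fiedler >= 0

def ncut_value(W, mask):
    """Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V)."""
    cut = W[mask][:, ~mask].sum()
    assoc_a = max(W[mask].sum(), 1e-12)
    assoc_b = max(W[~mask].sum(), 1e-12)
    return cut / assoc_a + cut / assoc_b

def hierarchical_ncut(W, nodes=None, max_ncut=0.5):
    """Recursively split the graph; leaves are lists of node ids, internal nodes are (left, right)."""
    if nodes is None:
        nodes = list(range(len(W)))
    if len(nodes) < 2:
        return nodes
    sub = W[np.ix_(nodes, nodes)]
    mask = ncut_bipartition(sub)
    # Stop when the split is degenerate or too costly (hypothetical stopping rule).
    if mask.all() or not mask.any() or ncut_value(sub, mask) > max_ncut:
        return nodes
    left = [n for n, m in zip(nodes, mask) if m]
    right = [n for n, m in zip(nodes, mask) if not m]
    return (hierarchical_ncut(W, left, max_ncut), hierarchical_ncut(W, right, max_ncut))

def leaves(tree):
    """Collect the leaf groups of the tree."""
    if isinstance(tree, list):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)

# Toy association graph: two tightly linked groups of clusters with weak cross links.
W = np.full((6, 6), 0.01)
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
np.fill_diagonal(W, 0.0)
print(leaves(hierarchical_ncut(W)))
```

In this toy graph each tightly linked group stands in for the views of one physical place. A real association graph would be built from how often key-frame clusters co-occur within the same shot, which this sketch does not model.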