A dataset and evaluation methodology for visual saliency in video

  • Authors:
  • Jia Li; Yonghong Tian; Tiejun Huang; Wen Gao

  • Affiliations:
  • Jia Li: Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences; Graduate University of Chinese Academy of Sciences, China
  • Yonghong Tian: Institute of Digital Media, School of EE & CS, Peking University, China
  • Tiejun Huang: Institute of Digital Media, School of EE & CS, Peking University, China
  • Wen Gao: Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences; Institute of Digital Media, School of EE & CS, Peking University, China

  • Venue:
  • ICME '09: Proceedings of the 2009 IEEE International Conference on Multimedia and Expo
  • Year:
  • 2009


Abstract

Recently, visual saliency has attracted considerable research interest in computer vision and multimedia, and various approaches for computing it have been proposed. Several datasets exist for evaluating visual saliency in images, but few capture spatiotemporal visual saliency in video. Intuitively, visual saliency in video is strongly affected by temporal context and may vary significantly even across visually similar frames. In this paper, we present an extensive dataset, containing 7.5 hours of video, for capturing spatiotemporal visual saliency. The salient regions in frames sequentially sampled from these videos were manually labeled by 23 subjects and then averaged to generate ground-truth saliency maps. We also present three metrics for evaluating competing approaches, and we evaluate several representative algorithms on the dataset. The experimental results show that the dataset is well suited for evaluating visual saliency, and we report several interesting findings that merit future research. The dataset is freely available online, together with the source code for evaluation.
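
As a rough illustration of the labeling-and-averaging procedure described in the abstract, the sketch below (Python/NumPy) averages per-subject binary masks into a ground-truth saliency map and then scores a predicted map against it. The function names, array shapes, and the linear correlation metric are our own illustrative assumptions; the abstract does not name the paper's three metrics or describe its released code.

```python
import numpy as np

def ground_truth_map(subject_masks):
    """Average per-subject binary labels into a ground-truth saliency map.

    subject_masks: (n_subjects, H, W) array with values in {0, 1}, one
    binary mask per subject marking the regions that subject judged
    salient.  The mean over subjects gives a map in [0, 1] where higher
    values indicate stronger agreement that a pixel is salient.
    """
    return np.asarray(subject_masks, dtype=np.float64).mean(axis=0)

def linear_cc(pred, gt):
    """Pearson linear correlation between a predicted saliency map and
    the ground truth -- a common saliency metric, used here purely as an
    illustration (not necessarily one of the paper's three metrics)."""
    p = (pred - pred.mean()) / (pred.std() + 1e-12)
    g = (gt - gt.mean()) / (gt.std() + 1e-12)
    return float((p * g).mean())

# Toy example: 23 subjects labeling one 4x4 frame.
rng = np.random.default_rng(0)
masks = rng.integers(0, 2, size=(23, 4, 4))   # binary labels per subject
gt = ground_truth_map(masks)                  # agreement map in [0, 1]
pred = rng.random((4, 4))                     # a hypothetical model's output
print(f"agreement map:\n{gt}\nCC = {linear_cc(pred, gt):.3f}")
```

Averaging rather than thresholding preserves the degree of inter-subject agreement, so evaluation can reward predictions that match regions most subjects marked salient more than regions only a few subjects marked.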