Crowdsourced automatic zoom and scroll for video retargeting

Authors:
Axel Carlier;Vincent Charvillat;Wei Tsang Ooi;Romulus Grigoras;Geraldine Morin
Affiliations:
National University of Singapore, Singapore, Singapore;University of Toulouse, Toulouse, France;National University of Singapore, Singapore, Singapore;University of Toulouse, Toulouse, France;University of Toulouse, Toulouse, France
Venue:
Proceedings of the international conference on Multimedia
Year:
2010


Abstract

Screen size and display resolution limit the experience of watching videos on mobile devices. The viewing experience can be improved by determining important or interesting regions within the video (called regions of interest, or ROIs) and displaying only the ROIs to the viewer. Previous work focuses on analyzing the video content using visual attention models to infer the ROIs. Such content-based techniques, however, have limitations. In this paper, we propose an alternative paradigm for inferring ROIs from a video. We crowdsource the ROIs from a large number of users through their implicit viewing behavior in a zoom-and-pan interface, and infer the ROIs from their collective wisdom. A retargeted video, consisting of relevant shots determined from historical user behavior, can be automatically generated and replayed to subsequent users who would prefer a less interactive viewing experience. This paper presents how we collect the user traces, infer the ROIs and their dynamics, group the ROIs into shots, and automatically reframe those shots to improve the aesthetics of the video. A user study with 48 participants shows that our automatically retargeted video is of comparable quality to one handcrafted by an expert user.
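
The abstract does not specify how ROIs are inferred from the user traces, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes each user's trace at a given frame reduces to one viewport rectangle, counts how many viewports cover each pixel, and returns the bounding box of the pixels seen by at least a chosen fraction of users. The function name `infer_roi`, the `agreement` parameter, and the rectangle format are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical trace format: one viewport per user, as (x, y, width, height) in pixels.
Rect = Tuple[int, int, int, int]


def infer_roi(viewports: List[Rect], frame_w: int, frame_h: int,
              agreement: float = 0.5) -> Rect:
    """Bounding box of pixels covered by at least `agreement` of the viewports.

    Illustrative assumption only: a simple coverage vote, not the paper's method.
    """
    full_frame = (0, 0, frame_w, frame_h)
    if not viewports:
        return full_frame  # no traces: fall back to the whole frame

    # Count, for every pixel, how many users' viewports covered it.
    heat = [[0] * frame_w for _ in range(frame_h)]
    for x, y, w, h in viewports:
        for row in range(max(0, y), min(frame_h, y + h)):
            for col in range(max(0, x), min(frame_w, x + w)):
                heat[row][col] += 1

    # Keep pixels on which enough users agree.
    needed = agreement * len(viewports)
    hot = [(col, row)
           for row in range(frame_h)
           for col in range(frame_w)
           if heat[row][col] >= needed]
    if not hot:
        return full_frame

    xs = [col for col, _ in hot]
    ys = [row for _, row in hot]
    x0, x1 = min(xs), max(xs)
    y0, y1 = min(ys), max(ys)
    return (x0, y0, x1 - x0 + 1, y1 - y0 + 1)


if __name__ == "__main__":
    traces = [(2, 2, 6, 6), (4, 4, 6, 6), (3, 3, 6, 6)]
    print(infer_roi(traces, frame_w=16, frame_h=12))
```

Raising `agreement` shrinks the result toward the region every user looked at, while lowering it approaches the union of all viewports. A fuller pipeline would also have to track how the ROI moves over time and group frames into shots, which this sketch leaves out.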