Screen size and display resolution limit the experience of watching videos on mobile devices. The viewing experience can be improved by determining important or interesting regions within the video (called regions of interest, or ROIs) and displaying only the ROIs to the viewer. Previous work focuses on analyzing the video content using visual attention models to infer the ROIs. Such content-based techniques, however, have limitations. In this paper, we propose an alternative paradigm for inferring ROIs from a video. We crowdsource the implicit viewing behavior of a large number of users through a zoom-and-pan interface and infer the ROIs from their collective wisdom. A retargeted video, consisting of relevant shots determined from historical user behavior, can be automatically generated and replayed to subsequent users who would prefer a less interactive viewing experience. This paper presents how we collect the user traces, infer the ROIs and their dynamics, group the ROIs into shots, and automatically reframe those shots to improve the aesthetics of the video. A user study with 48 participants shows that our automatically retargeted video is of comparable quality to one handcrafted by an expert user.
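To make the idea of inferring an ROI from collective viewing behavior concrete, the following is a minimal sketch and not the method of the paper, whose details the abstract does not give. It assumes each user trace has already been reduced to one viewport rectangle per frame, and it uses a simple grid vote: a cell counts as interesting when at least a chosen share of users had it on screen. The function name `infer_roi`, the `min_share` threshold, and the `cell` size are all hypothetical choices made for illustration.

```python
from typing import List, Optional, Tuple

# A viewport is (x, y, width, height) in frame pixels (assumed trace format).
Rect = Tuple[int, int, int, int]


def infer_roi(viewports: List[Rect], frame_w: int, frame_h: int,
              min_share: float = 0.5, cell: int = 16) -> Optional[Rect]:
    """Bounding box of grid cells that at least `min_share` of users viewed.

    Illustrative grid-voting heuristic only; returns None when no cell
    reaches the threshold or when there are no viewports.
    """
    if not viewports:
        return None
    cols = (frame_w + cell - 1) // cell
    rows = (frame_h + cell - 1) // cell
    votes = [[0] * cols for _ in range(rows)]

    for x, y, w, h in viewports:
        # Clamp the viewport to the frame and skip empty rectangles.
        left, top = max(0, x), max(0, y)
        right, bottom = min(frame_w, x + w), min(frame_h, y + h)
        if right <= left or bottom <= top:
            continue
        # Every cell the viewport overlaps receives one vote.
        for r in range(top // cell, (bottom - 1) // cell + 1):
            for c in range(left // cell, (right - 1) // cell + 1):
                votes[r][c] += 1

    threshold = min_share * len(viewports)
    hot = [(r, c) for r in range(rows) for c in range(cols)
           if votes[r][c] >= threshold]
    if not hot:
        return None

    # Convert the extent of the hot cells back to pixel coordinates.
    r0 = min(r for r, _ in hot)
    r1 = max(r for r, _ in hot)
    c0 = min(c for _, c in hot)
    c1 = max(c for _, c in hot)
    x0, y0 = c0 * cell, r0 * cell
    x1 = min((c1 + 1) * cell, frame_w)
    y1 = min((r1 + 1) * cell, frame_h)
    return (x0, y0, x1 - x0, y1 - y0)
```

Calling `infer_roi` once per frame would yield a sequence of rectangles. Smoothing that sequence over time and splitting it into shots are separate steps that this sketch does not attempt.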