Efficiently scaling up video annotation with crowdsourced marketplaces

  • Authors:
  • Carl Vondrick; Deva Ramanan; Donald Patterson

  • Affiliations:
  • Department of Computer Science, University of California, Irvine (all authors)

  • Venue:
  • ECCV'10: Proceedings of the 11th European Conference on Computer Vision, Part IV
  • Year:
  • 2010

Abstract

Accurately annotating entities in video is labor intensive and expensive. As the quantity of online video grows, traditional solutions to this task are unable to scale to meet the needs of researchers with limited budgets. Current practice provides a temporary solution by paying dedicated workers to label a fraction of the total frames and otherwise settling for linear interpolation. As budgets and scale require sparser key frames, the assumption of linearity fails and labels become inaccurate. To address this problem, we have created a public framework for dividing the work of labeling video data into micro-tasks that can be completed by the huge labor pools available through crowdsourced marketplaces. By extracting pixel-based features from manually labeled entities, we are able to leverage more sophisticated interpolation between key frames to maximize performance given a budget. Finally, by validating the power of our framework on difficult, real-world data sets, we demonstrate an inherent trade-off between the mix of human and cloud computing used and the accuracy and cost of the labeling.
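
For context, the linear-interpolation baseline described in the abstract can be stated concretely. The sketch below is not the authors' code: the `Box` class and `interpolate_linear` function are hypothetical names introduced here for illustration. It simply fills in bounding-box coordinates for the frames between two manually labeled key frames, which is precisely the assumption that breaks down when key frames become sparse and object motion is nonlinear.

```python
from dataclasses import dataclass

@dataclass
class Box:
    """Axis-aligned bounding box for one frame (hypothetical representation)."""
    frame: int
    xmin: float
    ymin: float
    xmax: float
    ymax: float

def interpolate_linear(key_a: Box, key_b: Box) -> list[Box]:
    """Linearly interpolate box coordinates between two key frames.

    Assumes key_a.frame < key_b.frame; returns one box per frame, inclusive.
    """
    span = key_b.frame - key_a.frame
    boxes = []
    for f in range(key_a.frame, key_b.frame + 1):
        t = (f - key_a.frame) / span  # fraction of the way from key_a to key_b
        boxes.append(Box(
            frame=f,
            xmin=(1 - t) * key_a.xmin + t * key_b.xmin,
            ymin=(1 - t) * key_a.ymin + t * key_b.ymin,
            xmax=(1 - t) * key_a.xmax + t * key_b.xmax,
            ymax=(1 - t) * key_a.ymax + t * key_b.ymax,
        ))
    return boxes

# Example: a worker labels frames 0 and 30; the 29 frames in between are filled in
# by straight-line motion, regardless of how the object actually moved.
path = interpolate_linear(Box(0, 10, 20, 60, 120), Box(30, 40, 25, 90, 125))
```

The paper's contribution is to replace this purely geometric fill-in with interpolation informed by pixel-based features of the labeled entity, so that fewer (and thus cheaper) human-labeled key frames are needed for a given accuracy.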