Making a scene: alignment of complete sets of clips based on pairwise audio match

  • Authors: Kai Su (Rutgers University), Mor Naaman (Rutgers University), Avadhut Gurjar (Rutgers University), Mohsin Patel (Rutgers University), Daniel P. W. Ellis (Columbia University)

  • Venue: Proceedings of the 2nd ACM International Conference on Multimedia Retrieval (ICMR)
  • Year: 2012

Abstract

As the amount of social video content captured at physical-world events and shared online rapidly increases, there is a growing need for robust methods to organize and present the captured content. In this work, we significantly extend prior work that examined automatic detection of videos from an event that were captured at the same time, i.e., "overlapping" videos. We go beyond finding pairwise matches between video clips and describe the construction of scenes: sets of multiple overlapping videos, each presenting a coherent moment in the event. We test multiple strategies for scene construction, using a greedy algorithm to map videos into scenes and a clustering refinement step to increase the precision of each scene. We evaluate the strategies in multiple settings and show that the greedy-plus-clustering approach yields the best balance between recall and precision across all settings.
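
The abstract describes the approach only at a high level. The Python sketch below is a minimal illustration, under assumed inputs, of what greedy scene construction from pairwise audio matches followed by a pruning-style refinement could look like; the match list, confidence scores, thresholds, and function names are hypothetical and are not taken from the paper.

```python
"""Illustrative sketch only: greedily group clips into "scenes" from
pairwise audio-match confidences, then prune weakly connected clips as a
crude stand-in for a clustering refinement step.  All inputs are made up."""

from collections import defaultdict

# Hypothetical output of a pairwise audio matcher: (clip_a, clip_b, confidence).
PAIRWISE_MATCHES = [
    ("clip1", "clip2", 0.95),
    ("clip2", "clip3", 0.90),
    ("clip4", "clip5", 0.85),
    ("clip3", "clip4", 0.20),   # weak, likely spurious match
]

def greedy_scenes(matches, min_confidence=0.5):
    """Greedily merge clips connected by confident pairwise matches."""
    scene_of = {}                 # clip -> scene id
    scenes = defaultdict(set)     # scene id -> set of clips
    next_id = 0
    # Process the strongest matches first so confident links seed the scenes.
    for a, b, conf in sorted(matches, key=lambda m: -m[2]):
        if conf < min_confidence:
            continue
        sa, sb = scene_of.get(a), scene_of.get(b)
        if sa is None and sb is None:          # start a new scene
            scenes[next_id] = {a, b}
            scene_of[a] = scene_of[b] = next_id
            next_id += 1
        elif sa is not None and sb is None:    # attach b to a's scene
            scenes[sa].add(b)
            scene_of[b] = sa
        elif sa is None and sb is not None:    # attach a to b's scene
            scenes[sb].add(a)
            scene_of[a] = sb
        elif sa != sb:                         # merge two existing scenes
            scenes[sa] |= scenes[sb]
            for clip in scenes[sb]:
                scene_of[clip] = sa
            del scenes[sb]
    return list(scenes.values())

def refine(scene, matches, min_avg=0.4):
    """Drop clips whose average match confidence to the rest of the scene
    is low -- a simple proxy for a clustering refinement step."""
    conf = {(a, b): c for a, b, c in matches}
    conf.update({(b, a): c for a, b, c in matches})
    kept = set(scene)
    for clip in list(kept):
        others = kept - {clip}
        scores = [conf.get((clip, other), 0.0) for other in others]
        if others and sum(scores) / len(others) < min_avg:
            kept.discard(clip)
    return kept

if __name__ == "__main__":
    for scene in greedy_scenes(PAIRWISE_MATCHES):
        print(sorted(refine(scene, PAIRWISE_MATCHES)))
```

On this toy input the strong matches chain clip1 through clip3 into one scene and clip4 with clip5 into another, while the weak clip3/clip4 link is ignored; a real system would use audio-fingerprint match confidences and temporal offsets rather than hand-written scores.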