Unsupervised temporal commonality discovery

Authors:
Wen-Sheng Chu;Feng Zhou;Fernando De la Torre
Affiliations:
Robotics Institute, Carnegie Mellon University;Robotics Institute, Carnegie Mellon University;Robotics Institute, Carnegie Mellon University
Venue:
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part IV
Year:
2012

Abstract

Unsupervised discovery of commonalities in images has recently attracted much interest due to the need to find correspondences in large amounts of visual data. A natural extension, and a relatively unexplored problem, is how to discover common semantic temporal patterns in videos. That is, given two or more videos, find the subsequences that contain similar visual content in an unsupervised manner. We call this problem Temporal Commonality Discovery (TCD). A naive exhaustive search for the TCD problem has computational complexity quadratic in the length of each sequence, making it impractical for sequences of regular length. This paper proposes an efficient branch and bound (B&B) algorithm to tackle the TCD problem. We derive tight bounds for classical distances between the temporal bags of words of two segments, including ℓ1, intersection, and χ². Using these bounds, the B&B algorithm can efficiently find the globally optimal solution. Our algorithm is general and can be applied to any feature that has been quantized into histograms. Experiments on finding common facial actions in video and human actions in motion capture data demonstrate the benefits of our approach. To the best of our knowledge, this is the first work that addresses unsupervised discovery of common events in videos.
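
To make the cost of the naive baseline concrete, the following is a minimal sketch of an exhaustive search over pairs of segments, scored by the ℓ1 distance between their normalized bag-of-words histograms. It is not the B&B algorithm proposed in the paper. All names (`prefix_counts`, `segment_hist`, `naive_tcd`), the representation of each frame as an integer word index, and the `min_len`/`max_len` bounds on segment length are assumptions introduced only for illustration.

```python
def prefix_counts(words, vocab_size):
    """prefix[t][k] = number of occurrences of word k in words[:t]."""
    prefix = [[0] * vocab_size]
    for w in words:
        row = prefix[-1][:]   # copy the previous cumulative counts
        row[w] += 1
        prefix.append(row)
    return prefix


def segment_hist(prefix, begin, end):
    """Normalized histogram of frames begin..end-1 (end is exclusive)."""
    length = end - begin
    return [(hi - lo) / length for lo, hi in zip(prefix[begin], prefix[end])]


def l1_distance(h1, h2):
    """l1 distance between two histograms of equal size."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def segments(n, min_len, max_len):
    """Yield every (begin, end) with min_len <= end - begin <= max_len."""
    for begin in range(n):
        for end in range(begin + min_len, min(n, begin + max_len) + 1):
            yield begin, end


def naive_tcd(x, y, vocab_size, min_len, max_len):
    """Exhaustively score every pair of segments from x and y.

    Hypothetical baseline: with unrestricted lengths, the number of
    candidate pairs grows quadratically in len(x) and in len(y).
    Returns (distance, (b1, e1), (b2, e2)) for the first best pair found.
    """
    px = prefix_counts(x, vocab_size)
    py = prefix_counts(y, vocab_size)
    best = (float("inf"), None, None)
    for b1, e1 in segments(len(x), min_len, max_len):
        h1 = segment_hist(px, b1, e1)
        for b2, e2 in segments(len(y), min_len, max_len):
            d = l1_distance(h1, segment_hist(py, b2, e2))
            if d < best[0]:
                best = (d, (b1, e1), (b2, e2))
    return best
```

For example, `naive_tcd([0, 0, 1, 2, 1, 2, 0], [3, 3, 3, 1, 2, 1, 2], 4, 4, 4)` compares every length-4 window of the first sequence against every length-4 window of the second. A bounding scheme such as the one the abstract describes would aim to discard most of these pairs without evaluating them individually.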