VideoCut: Removing Irrelevant Frames by Discovering the Object of Interest

Authors:
David Liu; Gang Hua; Tsuhan Chen
Affiliations:
Dept. of ECE, Carnegie Mellon University; Microsoft Live Labs; Dept. of ECE, Carnegie Mellon University
Venue:
ECCV '08 Proceedings of the 10th European Conference on Computer Vision: Part I
Year:
2008

Abstract

We propose a novel method for removing irrelevant frames from a video, given user-provided frame-level labels for a very small number of frames. We first hypothesize a number of candidate areas that may contain the object of interest, and then determine which area(s) actually contain it. Our method has several favorable properties. First, compared with approaches that use a single descriptor for the whole frame, each area's feature descriptor can describe the object of interest itself, and is therefore less affected by background clutter. Second, by exploiting the temporal continuity of a video instead of treating frames as independent, we can hypothesize the locations of the candidate areas more accurately. Third, by infusing prior knowledge into the topic-motion model, we can precisely follow the trajectory of the object of interest. This allows us to greatly reduce the number of candidate areas and hence the chance of overfitting during learning. We demonstrate the effectiveness of the method by comparing it with several other semi-supervised learning approaches on challenging video clips.
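
The abstract gives no implementation details, so the following is only a minimal illustrative sketch of the area-based idea: a frame is kept when at least one of its candidate areas looks like the object of interest. The per-area scores, the max-pooling rule, the 0.5 threshold, and the function names `frame_score` and `keep_relevant_frames` are all assumptions made for illustration. This is not the paper's topic-motion model, and it ignores temporal continuity entirely.

```python
from typing import List, Sequence


def frame_score(area_scores: Sequence[float]) -> float:
    """Score a frame by its best candidate area (hypothetical pooling rule).

    Each entry in area_scores is assumed to be the output of some
    per-area classifier in [0, 1]; a frame with no candidate areas
    scores 0.0.
    """
    return max(area_scores) if area_scores else 0.0


def keep_relevant_frames(
    per_frame_area_scores: Sequence[Sequence[float]],
    threshold: float = 0.5,  # assumed cutoff, not taken from the paper
) -> List[int]:
    """Return indices of frames whose best area score reaches the threshold."""
    return [
        index
        for index, scores in enumerate(per_frame_area_scores)
        if frame_score(scores) >= threshold
    ]


if __name__ == "__main__":
    # Made-up scores for four frames with a varying number of candidate areas.
    example_scores = [
        [0.1, 0.2, 0.9],  # one area strongly matches the object
        [0.1, 0.3],       # no area matches
        [],               # no candidate areas were hypothesized
        [0.6, 0.05],      # one area matches
    ]
    print(keep_relevant_frames(example_scores))  # prints [0, 3]
```

Pooling over areas rather than scoring one whole-frame descriptor mirrors the first property listed in the abstract: a small object can still dominate the frame's score even when most of the frame is background clutter.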