This paper addresses the problem of retrieving video sequences that contain a spatio-temporal pattern queried by a user. To achieve this, the visual content of each video sequence is first decomposed through an analysis of its local feature dynamics. The sequence's camera motion, the background and objects present in the captured scene, and the events occurring within it are represented, respectively, by the parameters of an estimated global motion model, the appearance of the extracted local features, and their trajectories. At query time, a probabilistic model of the visual pattern is estimated from user interaction, captured through a relevance-feedback loop. We show that the method efficiently retrieves video sequences that share a spatio-temporal pattern, even partially.
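The query mechanism described above can be illustrated with a minimal sketch. Here each video sequence is summarized by a fixed-length descriptor (standing in for global-motion parameters plus pooled local-feature statistics), the probabilistic query model is an axis-aligned Gaussian fitted to the sequences the user marked relevant, and the database is re-ranked by log-likelihood under that model. All function names, the descriptor dimensionality, and the Gaussian model choice are illustrative assumptions, not the paper's actual formulation:

```python
import math
import random

def fit_gaussian(examples):
    """Fit an axis-aligned Gaussian (mean, per-dimension variance)
    to the descriptors of the user-marked relevant sequences."""
    d = len(examples[0])
    n = len(examples)
    mean = [sum(x[i] for x in examples) / n for i in range(d)]
    # Floor the variance to keep the likelihood well-defined.
    var = [max(1e-6, sum((x[i] - mean[i]) ** 2 for x in examples) / n)
           for i in range(d)]
    return mean, var

def log_likelihood(x, mean, var):
    """Log-density of descriptor x under the diagonal Gaussian model."""
    return -0.5 * sum(math.log(2 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))

def feedback_round(database, relevant_ids):
    """One relevance-feedback iteration: re-estimate the query model
    from the marked-relevant sequences, then re-rank the database."""
    model = fit_gaussian([database[i] for i in relevant_ids])
    return sorted(database, key=lambda x: -log_likelihood(x, *model))

random.seed(0)
# Toy database: 4-D descriptors from two clusters, around 1.0 and 5.0.
database = ([[random.gauss(1.0, 0.3) for _ in range(4)] for _ in range(5)]
            + [[random.gauss(5.0, 0.3) for _ in range(4)] for _ in range(5)])

# The user marks three sequences from the first cluster as relevant.
ranked = feedback_round(database, relevant_ids=[0, 1, 2])
print(ranked[0][0] < 3.0)  # top-ranked sequence comes from the first cluster
```

In an actual system the loop would repeat: after each re-ranking the user marks further results, and the model is re-estimated from the enlarged relevant set, progressively sharpening the query.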