State-of-the-art video retrieval and classification applications rely predominantly on spatio-temporal features. These tasks are typically evaluated only on their final performance; no systematic analysis of feature robustness, invariance, and stability has yet been carried out for large-scale video retrieval. In this work, we analyze the impact of visual transformations on spatio-temporal features in large-scale experiments. Following recent state-of-the-art evaluations, we choose the best-performing approaches, namely the spatio-temporal Harris3D, Hessian3D, and Cuboid detectors and the HOG/HOF, SURF3D, and HOG3D descriptors. We show that these features have different properties and behave differently under varying transformations (challenges). These findings help researchers justify their choice of features for new applications and optimize the choice of input video in terms of resolution, compression, frames per second, or noise suppression. We make the extracted features available online for further independent evaluation and applications.