Behavior and properties of spatio-temporal local features under visual transformations

Authors:
Julian Stöttinger; Bogdan Tudor Goras; Nicu Sebe; Allan Hanbury
Affiliations:
Vienna University of Technology, Vienna, Austria; Technical University of Iasi, Iasi, Romania; University of Trento, Povo - Trento, Italy; Information Retrieval Facility, Vienna, Austria
Venue:
Proceedings of the international conference on Multimedia
Year:
2010

Abstract

State-of-the-art video retrieval and classification applications rely predominantly on spatio-temporal features. These tasks are typically evaluated only on their final performance, and no systematic analysis of feature robustness, invariance, and stability has yet been carried out for large-scale video retrieval. In this work, we analyze the impact of visual transformations on spatio-temporal features in large-scale experiments. Following recent state-of-the-art evaluations, we choose the best-performing approaches, namely the spatio-temporal Harris3D, Hessian3D, and Cuboid detectors and the HOG/HOF, SURF3D, and HOG3D descriptors. We show that these features have different properties and behave differently under varying transformations (challenges). These findings help researchers justify the choice of features for new applications and optimize the choice of input video in terms of resolution, compression, frames per second, or noise suppression. We make the extracted features available online for further independent evaluation and applications.