This paper presents a space-time extension of the scale-invariant feature transform (SIFT), originally designed for 2-dimensional (2D) images. Most previous extensions deal with 3-dimensional (3D) spatial information, combining a 2D detector with a 3D descriptor for applications such as the analysis of volumetric medical images. In this work, aimed at processing video streams, we build a spatio-temporal difference-of-Gaussian (DoG) pyramid to detect local extrema. Interest points are extracted not only from the spatial plane (xy) but also from the planes along the time axis (xt and yt). The space-time extension was evaluated on the human action classification task. Experiments on the KTH and UCF Sports datasets show that the approach produces results comparable to the state of the art.
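As a rough illustration of the detection stage described above, the Python sketch below builds a spatio-temporal DoG stack over a video volume and locates local extrema. It is not the authors' implementation: for simplicity it searches for extrema jointly in the full (scale, t, y, x) neighbourhood rather than separately in the xy, xt, and yt planes, and all parameter values (sigma0, k, n_scales, threshold) are illustrative assumptions.

    # Minimal sketch of a spatio-temporal DoG detector, assuming the video
    # is a 3D NumPy array indexed (t, y, x). Not the paper's implementation;
    # parameters are placeholders.
    import numpy as np
    from scipy.ndimage import gaussian_filter, maximum_filter, minimum_filter

    def spatiotemporal_dog_extrema(video, sigma0=1.6, k=2 ** 0.5,
                                   n_scales=4, threshold=0.03):
        """Return (scale, t, y, x) indices of space-time DoG extrema."""
        video = video.astype(np.float32)
        # Blur jointly over (t, y, x) at geometrically increasing scales.
        blurred = [gaussian_filter(video, sigma0 * k ** i)
                   for i in range(n_scales)]
        # DoG levels: differences of adjacent Gaussian-blurred volumes.
        dog = np.stack([blurred[i + 1] - blurred[i]
                        for i in range(n_scales - 1)])
        # A voxel is a candidate if it is the maximum or minimum of its
        # 3x3x3x3 neighbourhood across (scale, t, y, x) and has a strong
        # enough response.
        is_max = dog == maximum_filter(dog, size=3)
        is_min = dog == minimum_filter(dog, size=3)
        strong = np.abs(dog) > threshold
        return np.argwhere((is_max | is_min) & strong)

With a grayscale clip loaded as a (frames, height, width) array, spatiotemporal_dog_extrema(clip) yields candidate keypoint coordinates; in the full pipeline, per-plane (xy, xt, yt) detection and descriptor computation would follow.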