A 3-dimensional SIFT descriptor and its application to action recognition

Authors:
Paul Scovanner; Saad Ali; Mubarak Shah
Affiliations:
University of Central Florida, Orlando; University of Central Florida, Orlando; University of Central Florida, Orlando
Venue:
Proceedings of the 15th international conference on Multimedia
Year:
2007


Abstract

In this paper, we introduce a 3-dimensional (3D) SIFT descriptor for video or 3D imagery such as MRI data. We also show that this new descriptor better represents the 3D nature of video data in the application of action recognition, and that 3D SIFT outperforms previously used description methods in an elegant and efficient manner. We use a bag-of-words approach to represent videos, and present a method to discover relationships between spatio-temporal words in order to better describe the video data.
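
As a rough illustration of the bag-of-words representation mentioned in the abstract, the sketch below assigns each local descriptor to its nearest codebook word and summarizes a video as a normalized word histogram. It is a generic sketch, not the authors' implementation: the function name `bag_of_words_histogram`, the tiny 2-D toy descriptors, and the hand-picked codebook are all assumptions made for illustration. In practice the descriptors would be much higher-dimensional and the codebook would be learned, for example by clustering.

```python
import numpy as np


def bag_of_words_histogram(descriptors: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Quantize descriptors against a codebook; return a normalized word histogram.

    descriptors: array of shape (n_descriptors, dim)
    codebook:    array of shape (n_words, dim)
    (Both names are hypothetical and used only in this sketch.)
    """
    # Squared Euclidean distance from every descriptor to every codeword,
    # computed via broadcasting: result has shape (n_descriptors, n_words).
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)

    # Each descriptor is assigned to the index of its nearest codeword.
    words = np.argmin(dists, axis=1)

    # Count how often each word occurs, then normalize to sum to 1.
    counts = np.bincount(words, minlength=codebook.shape[0]).astype(float)
    total = counts.sum()
    return counts / total if total > 0 else counts


# Toy data (assumed for illustration): three codewords and four descriptors.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
descriptors = np.array([[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [4.7, 5.3]])

hist = bag_of_words_histogram(descriptors, codebook)
print(hist)  # one descriptor maps to word 0, two to word 1, one to word 2
```

The resulting histogram is a fixed-length vector regardless of how many descriptors a video yields, which is what makes it convenient as input to a standard classifier.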