Recently, methods based on bags of spatio-temporal local features have received significant attention in human action recognition. However, overcoming intra-class variation caused by changes in viewpoint, geometry, and illumination remains a major challenge. In this paper we present Bag of Spatio-temporal Synonym Sets (ST-SynSets) to represent human actions, which can partially bridge the semantic gap between visual appearance and category semantics. First, the method re-clusters the original visual words into higher-level ST-SynSets based on the consistency of their distributions across action categories, using the Information Bottleneck clustering method. Second, it adaptively learns a distance metric under both visual and semantic constraints for ST-SynSet projection. Experiments and comparisons with state-of-the-art methods show the effectiveness and robustness of the proposed method for human action recognition, especially under multiple viewpoints and illumination conditions.
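The first step, grouping visual words whose distributions over action categories agree, can be illustrated with a small sketch. The abstract does not say which Information Bottleneck variant is used, so the code below assumes the greedy agglomerative form, in which the pair of clusters with the smallest weighted Jensen-Shannon divergence between their category distributions is merged at each step. The function names, the `counts` matrix layout (one row per visual word, one column per action category), and the toy numbers are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def js_divergence(p, q, w_p, w_q):
    """Weighted Jensen-Shannon divergence between distributions p and q."""
    m = w_p * p + w_q * q

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute nothing
        return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))

    return w_p * kl(p, m) + w_q * kl(q, m)


def merge_words_into_synsets(counts, n_synsets):
    """Greedy agglomerative IB-style merging (assumed variant).

    counts[w, c] = how often visual word w occurs in action category c.
    Every row is assumed to have a nonzero sum.
    Returns a list of clusters, each a list of visual-word indices.
    """
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    clusters = [[w] for w in range(counts.shape[0])]
    mass = [row.sum() / total for row in counts]   # p(w)
    cond = [row / row.sum() for row in counts]     # p(c | w)

    while len(clusters) > n_synsets:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pij = mass[i] + mass[j]
                # Information lost about the category by merging i and j.
                cost = pij * js_divergence(
                    cond[i], cond[j], mass[i] / pij, mass[j] / pij
                )
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        pij = mass[i] + mass[j]
        cond[i] = (mass[i] * cond[i] + mass[j] * cond[j]) / pij
        mass[i] = pij
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j], mass[j], cond[j]  # j > i, so index i stays valid
    return clusters


# Toy data: words 0 and 1 occur mostly in category 0, words 2 and 3 in category 2.
toy_counts = [[9, 1, 0], [8, 2, 0], [0, 1, 9], [1, 0, 9]]
print(merge_words_into_synsets(toy_counts, n_synsets=2))
```

On the toy matrix, the two words concentrated in the first category end up in one group and the two concentrated in the last category in the other, which is the sense in which visually different words with similar category semantics become "synonyms". The pairwise search is quadratic in the number of clusters per merge, so a real vocabulary would likely need a more efficient scheme.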