Statistics of pairwise co-occurring local spatio-temporal features for human action recognition
ECCV'12 Proceedings of the 12th international conference on Computer Vision - Volume Part I
Action disambiguation analysis using normalized google-like distance correlogram
ACCV'12 Proceedings of the 11th Asian conference on Computer Vision - Volume Part III
A common approach to activity recognition has been to use histograms of codewords computed from Spatio-Temporal Interest Points (STIPs). Recent methods have focused on leveraging the spatio-temporal neighborhood structure of the features, but they are generally restricted to aggregate statistics over the entire video volume and ignore local pairwise relationships. Our goal is to capture these relationships in terms of pairwise co-occurrence statistics of codewords. We show that such co-occurrence relations reduce to the edges connecting the latent variables of a Conditional Random Field (CRF) classifier. As a consequence, we also learn the codeword dictionary as part of the maximum likelihood learning process, with each interest point assigned a probability distribution over the codewords. We show results on two widely used activity recognition datasets.
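
To make the idea of pairwise co-occurrence statistics over soft codeword assignments concrete, the following is a minimal sketch and not the CRF formulation of the paper. It assumes each interest point has an (x, y, t) location and a probability distribution over K codewords, and it treats two points as a local pair when they lie within a Euclidean radius of each other. The function name `soft_cooccurrence`, the array layout, and the radius-based neighborhood are all illustrative assumptions.

```python
import numpy as np


def soft_cooccurrence(points, probs, radius):
    """Accumulate a normalized K x K co-occurrence matrix over neighboring points.

    points: (N, 3) array of (x, y, t) locations (assumed layout).
    probs:  (N, K) array; each row is a distribution over K codewords.
    radius: Euclidean distance threshold defining a local pair (assumption).
    """
    points = np.asarray(points, dtype=float)
    probs = np.asarray(probs, dtype=float)

    # Pairwise Euclidean distances between all interest points.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Neighbor mask: pairs within the radius, excluding self-pairs.
    adj = (dist <= radius) & ~np.eye(len(points), dtype=bool)

    # Sum of outer products probs[i] x probs[j] over all neighboring pairs.
    cooc = probs.T @ adj.astype(float) @ probs

    # Normalize to a joint distribution when any pair exists.
    total = cooc.sum()
    return cooc / total if total > 0 else cooc


if __name__ == "__main__":
    pts = [[0, 0, 0], [1, 0, 0], [10, 10, 10]]
    p = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    print(soft_cooccurrence(pts, p, radius=2.0))
```

In this toy input only the first two points are neighbors, so all co-occurrence mass falls on the off-diagonal entries pairing codeword 0 with codeword 1. In the approach described in the abstract, such pairwise relations are instead encoded as edges between latent variables of a CRF, and the soft assignments are learned rather than given.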