Video retrieval by mimicking poses

Authors:
Nataraj Jammalamadaka; Andrew Zisserman; Marcin Eichner; Vittorio Ferrari; C. V. Jawahar
Affiliations:
IIIT-Hyderabad, India; University of Oxford, UK; ETH Zurich, Switzerland; University of Edinburgh, UK; IIIT-Hyderabad, India
Venue:
Proceedings of the 2nd ACM International Conference on Multimedia Retrieval
Year:
2012

Abstract

We describe a method for real-time video retrieval where the task is to match the 2D human pose of a query. A user can form a query by (i) interactively controlling a stickman in a web-based GUI, (ii) uploading an image of the desired pose, or (iii) acting out the pose in front of a Kinect. The method is scalable and is applied to a dataset of 18 films totaling more than three million frames. Real-time performance is achieved by searching for approximate nearest neighbors to the query using a random forest of K-D trees. Apart from the query modalities, we introduce two other areas of novelty. First, we show that pose retrieval can proceed using a low-dimensional representation. Second, we show that the precision of the results can be improved substantially by combining the outputs of independent human pose estimation algorithms. The performance of the system is assessed quantitatively over a range of pose queries.
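
To illustrate the nearest-neighbor search mentioned in the abstract, the following minimal Python sketch builds a single k-d tree over pose descriptors and answers a k-nearest-neighbor query. It is not the authors' implementation: the abstract describes approximate search over a random forest of K-D trees, whereas this sketch uses one tree with exact search for brevity. The descriptor layout, the names `Node`, `build`, `sq_dist`, and `knn`, and the sample values are all hypothetical.

```python
import heapq
from typing import List, Optional, Sequence, Tuple

Pose = Sequence[float]  # hypothetical low-dimensional pose descriptor


class Node:
    """One k-d tree node; stores the database index of its splitting pose."""

    def __init__(self, index: int, axis: int,
                 left: "Optional[Node]", right: "Optional[Node]") -> None:
        self.index = index
        self.axis = axis
        self.left = left
        self.right = right


def build(poses: List[Pose], ids: Optional[List[int]] = None,
          depth: int = 0) -> Optional[Node]:
    """Build a k-d tree by median split, cycling through the axes."""
    if ids is None:
        ids = list(range(len(poses)))
    if not ids:
        return None
    axis = depth % len(poses[0])
    ids = sorted(ids, key=lambda i: poses[i][axis])
    mid = len(ids) // 2
    return Node(ids[mid], axis,
                build(poses, ids[:mid], depth + 1),
                build(poses, ids[mid + 1:], depth + 1))


def sq_dist(a: Pose, b: Pose) -> float:
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn(root: Optional[Node], poses: List[Pose], query: Pose,
        k: int) -> List[Tuple[float, int]]:
    """Return the k nearest poses as (squared distance, index), closest first."""
    heap: List[Tuple[float, int]] = []  # max-heap via negated distances

    def visit(node: Optional[Node]) -> None:
        if node is None:
            return
        d = sq_dist(poses[node.index], query)
        if len(heap) < k:
            heapq.heappush(heap, (-d, node.index))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, node.index))
        diff = query[node.axis] - poses[node.index][node.axis]
        near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
        visit(near)
        # Cross the splitting plane only if it is closer than the current k-th best.
        if len(heap) < k or diff * diff < -heap[0][0]:
            visit(far)

    visit(root)
    return sorted((-neg_d, i) for neg_d, i in heap)


# Usage with made-up 2D descriptors (assumed values, not data from the paper).
database = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (5.0, 5.0), (0.9, 1.2), (-3.0, 4.0)]
tree = build(database)
matches = knn(tree, database, query=(1.0, 1.1), k=2)
```

A system at the scale described (millions of frames) would more plausibly trade exactness for speed, for example by bounding the number of nodes visited and querying several randomized trees, which is the kind of approximation the abstract refers to.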