Actions in still web images: visualization, detection and retrieval

Authors:
Piji Li;Jun Ma;Shuai Gao
Affiliations:
School of Computer Science & Technology, Shandong University, Jinan, China;School of Computer Science & Technology, Shandong University, Jinan, China;School of Computer Science & Technology, Shandong University, Jinan, China
Venue:
WAIM'11 Proceedings of the 12th international conference on Web-age information management
Year:
2011

Abstract

We describe a framework for retrieving human actions in still web images from verb queries such as "phoning". First, we build a group of visually discriminative instances for each action class, called "Exemplarlets". We then employ Multiple Kernel Learning (MKL) to learn an optimal combination of histogram intersection kernels, each of which captures a state-of-the-art feature channel. Our features include the distribution of edges, dense visual words, and feature descriptors at different levels of a spatial pyramid. For a new image, we detect the hot-region using a sliding-window detector learnt via MKL; the hot-region can imply latent actions in the image. After the hot-region has been detected, we build an inverted index in the visual search path, which we call the Visual Inverted Index (VII). Finally, by fusing the visual search path and the text search path, we obtain accurate results relevant to either the text or the visual information. We show both detection and retrieval results on our newly collected dataset of six actions and demonstrate improved performance over existing methods.
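
As a rough illustration of the ideas named in the abstract, the following Python sketch shows a histogram intersection kernel, a weighted combination of such kernels across feature channels, a simple inverted index from action labels to images, and a linear fusion of text and visual scores. It is a minimal sketch under stated assumptions, not the authors' implementation: all function names are hypothetical, the channel weights are supplied by hand rather than learned by MKL, and the linear fusion rule with parameter `alpha` is an assumption, since the abstract does not specify how the two search paths are fused.

```python
from collections import defaultdict


def histogram_intersection(h1, h2):
    """Histogram intersection kernel: sum of element-wise minima."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same length")
    return sum(min(a, b) for a, b in zip(h1, h2))


def combined_kernel(channels_x, channels_y, weights):
    """Weighted sum of per-channel intersection kernels.

    Assumption: in MKL the weights would be learned from data;
    here they are simply passed in by the caller.
    """
    return sum(
        w * histogram_intersection(x, y)
        for w, x, y in zip(weights, channels_x, channels_y)
    )


def build_inverted_index(image_actions):
    """Map each detected action label to the set of image ids showing it.

    `image_actions` is a hypothetical dict: image id -> list of action labels.
    """
    index = defaultdict(set)
    for image_id, labels in image_actions.items():
        for label in labels:
            index[label].add(image_id)
    return index


def fuse_scores(text_scores, visual_scores, alpha=0.5):
    """Rank image ids by a linear mix of text and visual scores.

    Assumption: linear fusion is used for illustration only.
    Missing scores count as 0.0; ties are broken by image id.
    """
    ids = set(text_scores) | set(visual_scores)
    fused = {
        i: alpha * text_scores.get(i, 0.0)
        + (1.0 - alpha) * visual_scores.get(i, 0.0)
        for i in ids
    }
    return sorted(fused, key=lambda i: (-fused[i], i))
```

For example, a verb query such as "phoning" could be looked up in the index returned by `build_inverted_index` to obtain candidate images, whose visual scores would then be merged with text-based scores by `fuse_scores`.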