Image retrieval with geometry-preserving visual phrases

Authors:
Yimeng Zhang; Zhaoyin Jia; Tsuhan Chen
Affiliations:
Sch. of Electr. & Comput. Eng., Cornell Univ., Ithaca, NY, USA; Sch. of Electr. & Comput. Eng., Cornell Univ., Ithaca, NY, USA; Sch. of Electr. & Comput. Eng., Cornell Univ., Ithaca, NY, USA
Venue:
CVPR '11 Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition
Year:
2011

Cited 14

Feature grouping and local soft match for mobile visual search

Pattern Recognition Letters
Semantic parsing of street scenes from video

International Journal of Robotics Research
Embedding spatial context information into inverted file for large-scale image retrieval

Proceedings of the 20th ACM international conference on Multimedia
Spatial pooling of heterogeneous features for image applications

Proceedings of the 20th ACM international conference on Multimedia
Improving bag-of-visual-words model with spatial-temporal correlation for video retrieval

Proceedings of the 21st ACM international conference on Information and knowledge management
Randomized spatial partition for scene recognition

ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part II
Towards exhaustive pairwise matching in large image collections

ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part I
An efficient parallel strategy for matching visual self-similarities in large image databases

ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part I
SIFT match verification by geometric coding for large-scale partial-duplicate web image search

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)
An efficient indexing method for content-based image retrieval

Neurocomputing
Image search—from thousands to billions in 20 years

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP) - Special Sections on the 20th Anniversary of ACM International Conference on Multimedia, Best Papers of ACM Multimedia 2012
Ranking consistency for image matching and object retrieval

Pattern Recognition
Generative Methods for Long-Term Place Recognition in Dynamic Scenes

International Journal of Computer Vision
Hough Pyramid Matching: Speeded-Up Geometry Re-ranking for Large Scale Image Retrieval

International Journal of Computer Vision

Abstract

The most popular approach to large-scale image retrieval is based on the bag-of-visual-words (BoV) representation of images. Spatial information is usually reintroduced as a post-processing step that re-ranks the retrieved images through spatial verification such as RANSAC. Because spatial verification techniques are computationally expensive, they can be applied only to the top-ranked images in the initial ranking. In this paper, we propose an approach that encodes more spatial information into the BoV representation and is efficient enough to be applied to large-scale databases. Other works with the same goal have explored word co-occurrences in neighborhood areas. Our approach encodes more spatial information through geometry-preserving visual phrases (GVP). In addition to co-occurrences, the GVP method captures both the local and long-range spatial layouts of the words. Our GVP-based search algorithm adds little memory usage or computational time compared to the BoV method. Moreover, we show that our approach can be integrated into the min-hash method to improve its retrieval accuracy. Experimental results on the Oxford 5K and Flickr 1M datasets show that our approach outperforms the BoV method even when BoV is followed by RANSAC verification.
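The abstract does not spell out how phrases are counted, so the following is only a minimal sketch of one way to score two images by shared words that also preserve their spatial layout. It assumes translation-only invariance, features given as (word_id, x, y) tuples, and positions quantized to a coarse grid; the names `gvp_similarity`, `k`, and `grid` are hypothetical and not taken from the paper. Every pair of features sharing a visual word votes for a position offset, and any k votes that land on the same offset are counted as one phrase of length k.

```python
from collections import Counter
from math import comb  # requires Python 3.8+


def gvp_similarity(query, candidate, k=2, grid=10):
    """Count length-k phrases shared by two images via offset voting.

    query, candidate: lists of (word_id, x, y) with pixel coordinates.
    k: phrase length (k=1 reduces to counting matched words, as in BoV).
    grid: assumed cell size in pixels used to quantize positions.
    """
    # Index the candidate's quantized positions by visual word.
    by_word = {}
    for word, x, y in candidate:
        cell = (int(x // grid), int(y // grid))
        by_word.setdefault(word, []).append(cell)

    # Each pair of features sharing a word votes for a translation offset.
    votes = Counter()
    for word, x, y in query:
        qx, qy = int(x // grid), int(y // grid)
        for cx, cy in by_word.get(word, []):
            votes[(cx - qx, cy - qy)] += 1

    # m matches with the same offset keep their relative layout,
    # so any k of them form one geometry-preserving phrase.
    return sum(comb(m, k) for m in votes.values())


query = [(1, 0, 0), (2, 20, 0), (3, 0, 30)]
shifted = [(w, x + 50, y + 40) for w, x, y in query]   # same layout, translated
scrambled = [(1, 50, 40), (2, 0, 90), (3, 90, 0)]      # same words, new layout

same_layout_score = gvp_similarity(query, shifted)
scrambled_score = gvp_similarity(query, scrambled)
```

In this toy example both candidate images contain the same three words, so a plain word count (k=1) cannot tell them apart, while the phrase count (k=2) rewards only the image whose words keep their relative positions. The voting pass touches each matched feature pair once, which is in the spirit of the abstract's claim of low overhead relative to BoV, though this sketch makes no claim about the paper's actual cost.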