The most popular approach to large-scale image retrieval is based on the bag-of-visual-words (BoV) representation of images. Spatial information is usually re-introduced as a post-processing step that re-ranks the retrieved images through a spatial verification such as RANSAC. Because spatial verification techniques are computationally expensive, they can be applied only to the top images in the initial ranking. In this paper, we propose an approach that encodes more spatial information into the BoV representation and that is efficient enough to be applied to large-scale databases. Other works pursuing the same goal have proposed exploring word co-occurrences in neighboring areas. Our approach encodes more spatial information through geometry-preserving visual phrases (GVP). In addition to co-occurrences, the GVP method also captures the local and long-range spatial layouts of the words. Our GVP-based search algorithm adds little memory usage or computational time compared to the BoV method. Moreover, we show that our approach can also be integrated with the min-hash method to improve its retrieval accuracy. Experimental results on the Oxford 5K and Flickr 1M datasets show that our approach outperforms the BoV method, even when BoV is followed by RANSAC verification.
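
To make the idea of a phrase that preserves spatial layout more concrete, here is a minimal sketch in Python. It is not the paper's algorithm. It assumes that a phrase is a set of k visual words whose relative positions are the same in both images up to a pure translation, and that offsets are quantized with a hypothetical step `cell`. The function name `gvp_similarity` and the `(word_id, x, y)` feature layout are also assumptions made for illustration.

```python
from collections import Counter
from math import comb


def gvp_similarity(feats_a, feats_b, k=2, cell=10.0):
    """Count length-k word groups that keep their layout across two images.

    feats_a, feats_b: lists of (word_id, x, y) tuples (assumed layout).
    cell: hypothetical quantization step for offsets, in pixels.
    """
    # Index the features of image B by visual word for fast lookup.
    by_word = {}
    for w, x, y in feats_b:
        by_word.setdefault(w, []).append((x, y))

    # Every pair of features with the same word votes for its
    # quantized translation offset (dx, dy).
    votes = Counter()
    for w, xa, ya in feats_a:
        for xb, yb in by_word.get(w, ()):
            dx = int((xb - xa) // cell)
            dy = int((yb - ya) // cell)
            votes[(dx, dy)] += 1

    # m matches that share one offset bin yield C(m, k) groups of size k
    # whose relative layout is preserved under that translation.
    return sum(comb(m, k) for m in votes.values())


# Image B is image A shifted by (25, 5); image C has the same words
# but a scrambled layout.
image_a = [(1, 0, 0), (2, 10, 0), (3, 0, 10)]
image_b = [(1, 25, 5), (2, 35, 5), (3, 25, 15)]
image_c = [(1, 25, 5), (2, 0, 50), (3, 100, 0)]

score_shifted = gvp_similarity(image_a, image_b)
score_scrambled = gvp_similarity(image_a, image_c)
```

In this sketch a plain BoV comparison would treat `image_b` and `image_c` as equally similar to `image_a`, since all three contain the same words. The offset voting scores only the translated copy, which illustrates why encoding layout can help without a separate verification pass.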