Towards low bit rate mobile visual search with multiple-channel coding

Authors:
Rongrong Ji;Ling-Yu Duan;Jie Chen;Hongxun Yao;Yong Rui;Shih-Fu Chang;Wen Gao
Affiliations:
Peking University, Beijing, China;Peking University, Beijing, China;Peking University, Beijing, China;Harbin Institute of Technology, Harbin, China;Microsoft China, Beijing, China;Columbia University, New York City, NY, USA;Peking University, Beijing, China
Venue:
MM '11 Proceedings of the 19th ACM international conference on Multimedia
Year:
2011

Citing 19
Cited 10

Term-weighting approaches in automatic text retrieval

Information Processing and Management: an International Journal
Learning Patterns of Activity Using Real-Time Tracking

IEEE Transactions on Pattern Analysis and Machine Intelligence
Video Google: A Text Retrieval Approach to Object Matching in Videos

ICCV '03 Proceedings of the Ninth IEEE International Conference on Computer Vision - Volume 2
Distinctive Image Features from Scale-Invariant Keypoints

International Journal of Computer Vision
A Performance Evaluation of Local Descriptors

IEEE Transactions on Pattern Analysis and Machine Intelligence
Scalable Recognition with a Vocabulary Tree

CVPR '06 Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2
Image Based Localization in Urban Environments

3DPVT '06 Proceedings of the Third International Symposium on 3D Data Processing, Visualization, and Transmission (3DPVT'06)
Modeling and Recognition of Landmark Image Collections Using Iconic Scene Graphs

ECCV '08 Proceedings of the 10th European Conference on Computer Vision: Part I
Image-Based Information Guide on Mobile Devices

ISVC '08 Proceedings of the 4th International Symposium on Advances in Visual Computing, Part II
Mapping the world's photos

Proceedings of the 18th international conference on World wide web
Tree Histogram Coding for Mobile Image Matching

DCC '09 Proceedings of the 2009 Data Compression Conference
Compression of image patches for local feature extraction

ICASSP '09 Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing
Location coding for mobile image retrieval

Proceedings of the 5th International ICST Mobile Multimedia Communications Conference
A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems

SIAM Journal on Imaging Sciences
Inverted Index Compression for Scalable Image Matching

DCC '10 Proceedings of the 2010 Data Compression Conference
PCA-SIFT: a more distinctive representation for local image descriptors

CVPR'04 Proceedings of the 2004 IEEE computer society conference on Computer vision and pattern recognition
The stanford mobile visual search data set

MMSys '11 Proceedings of the second annual ACM conference on Multimedia systems
SURF: speeded up robust features

ECCV'06 Proceedings of the 9th European conference on Computer Vision - Volume Part I

Finding perfect rendezvous on the go: accurate mobile visual localization and its applications to routing

Proceedings of the 20th ACM international conference on Multimedia
IMShare: instantly sharing your mobile landmark images by search-based reconstruction

Proceedings of the 20th ACM international conference on Multimedia
Local visual words coding for low bit rate mobile visual search

Proceedings of the 20th ACM international conference on Multimedia
Heritage app: annotating images on mobile phones

Proceedings of the Eighth Indian Conference on Computer Vision, Graphics and Image Processing
A recursive embedding algorithm towards lossless 2D vector map watermarking

Digital Signal Processing
Weakly supervised codebook learning by iterative label propagation with graph quantization

Signal Processing
Residual enhanced visual vector as a compact signature for mobile visual search

Signal Processing
Robust and accurate mobile visual localization and its applications

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP) - Special Sections on the 20th Anniversary of ACM International Conference on Multimedia, Best Papers of ACM Multimedia 2012
Listen, look, and gotcha: instant video search with mobile phones by layered audio-video indexing

Proceedings of the 21st ACM international conference on Multimedia
Where should I stand? Learning based human position recommendation for mobile photographing

Multimedia Tools and Applications

Abstract

In this paper, we propose a multiple-channel coding scheme to extract compact visual descriptors for low bit rate mobile visual search. Unlike previous visual search scenarios that send the query image, we make use of the ever-growing mobile computational capability to extract compact visual descriptors directly at the mobile end. Moreover, going beyond state-of-the-art compact descriptor extraction, we exploit the rich contextual cues at the mobile end (such as GPS tags for mobile visual search and 2D barcodes or RFID tags for mobile product search), together with the visual statistics of the reference database, to learn multiple coding channels. We therefore describe the query with one of many forms of high-dimensional visual signature, which is subsequently mapped to one or more channels and compressed. The compression function within each channel is learnt with a novel robust PCA scheme designed to preserve the retrieval ranking capability of the original signature. We have deployed our scheme on both iPhone4 and HTC DESIRE 7 to search ten million landmark images in a low bit rate setting. Quantitative comparisons with the state of the art demonstrate significant advantages in descriptor compactness (an improvement of orders of magnitude) and retrieval mAP in mobile landmark, product, and CD/book cover search.
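
The channel idea in the abstract can be illustrated with a minimal sketch. It is not the authors' method: ordinary PCA stands in for the paper's robust, ranking-preserving PCA, the channel is picked by the nearest GPS centroid (an assumption; the abstract only says contextual cues are used), and every name, dimension, and data value below is hypothetical toy input.

```python
import numpy as np

def learn_channel(signatures, n_components):
    """Fit plain PCA on one channel's reference signatures (one per row).

    Assumption: stands in for the paper's robust PCA, which is not reproduced here.
    """
    mean = signatures.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(signatures - mean, full_matrices=False)
    return mean, vt[:n_components]

def pick_channel(gps, centroids):
    """Return the index of the channel whose GPS centroid is nearest.

    Plain Euclidean distance on (lat, lon) is a crude toy choice.
    """
    return int(np.argmin(np.linalg.norm(centroids - gps, axis=1)))

def compress(signature, channel):
    """Project a high-dimensional signature onto the channel's PCA basis."""
    mean, basis = channel
    return basis @ (signature - mean)

rng = np.random.default_rng(0)

# Two toy channels, each learnt from 50 random 200-dimensional signatures.
centroids = np.array([[39.9, 116.4], [40.7, -74.0]])
channels = [learn_channel(rng.random((50, 200)), 8) for _ in centroids]

# A query taken near the first centroid is compressed by that channel only.
query_signature = rng.random(200)
k = pick_channel(np.array([39.95, 116.3]), centroids)
code = compress(query_signature, channels[k])
print(k, code.shape)
```

In this sketch only the 8 projection coefficients (plus the channel index) would need to leave the device, rather than the 200-dimensional signature or the image itself.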