Towards low bit rate mobile visual search with multiple-channel coding

  • Authors:
  • Rongrong Ji;Ling-Yu Duan;Jie Chen;Hongxun Yao;Yong Rui;Shih-Fu Chang;Wen Gao

  • Affiliations:
  • Peking University, Beijing, China;Peking University, Beijing, China;Peking University, Beijing, China;Harbin Institute of Technology, Harbin, China;Microsoft China, Beijing, China;Columbia University, New York City, NY, USA;Peking University, Beijing, China

  • Venue:
  • MM '11 Proceedings of the 19th ACM international conference on Multimedia
  • Year:
  • 2011

Abstract

In this paper, we propose a multiple-channel coding scheme to extract compact visual descriptors for low bit rate mobile visual search. Unlike previous visual search scenarios that send the query image, we make use of the ever-growing mobile computational capability to extract compact visual descriptors directly at the mobile end. Meanwhile, stepping forward from state-of-the-art compact descriptor extraction methods, we exploit the rich contextual cues at the mobile end (such as GPS tags for mobile visual search and 2D barcodes or RFID tags for mobile product search), together with the visual statistics of the reference database, to learn multiple coding channels. We therefore describe the query with one of many forms of high-dimensional visual signature, which is subsequently mapped to one or more channels and compressed. The compression function within each channel is learnt with a novel robust PCA scheme that is specifically designed to preserve the retrieval ranking capability of the original signature. We have deployed our scheme on both iPhone 4 and HTC DESIRE 7 to search ten million landmark images in a low bit rate setting. Quantitative comparisons to the state of the art demonstrate significant advantages in descriptor compactness (with orders-of-magnitude improvement) and retrieval mAP in mobile landmark, product, and CD/book cover search.
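
The channel-then-compress flow described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: plain PCA stands in for the paper's robust PCA, a channel is chosen by nearest GPS center using a crude Euclidean distance on latitude/longitude, and the names `fit_channel`, `select_channel`, and `compress`, the dimensions, and the random data are all hypothetical.

```python
import numpy as np

def fit_channel(signatures, n_components):
    """Fit a plain PCA projection for one channel.

    Assumption: ordinary PCA is used here as a stand-in; the paper's
    robust, ranking-preserving PCA is not reproduced.
    """
    mean = signatures.mean(axis=0)
    centered = signatures - mean
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def select_channel(gps, channel_centers):
    """Pick the channel whose (hypothetical) GPS center is nearest the query tag."""
    dists = np.linalg.norm(channel_centers - gps, axis=1)
    return int(np.argmin(dists))

def compress(signature, channel):
    """Project a high-dimensional signature onto the channel's low-dimensional basis."""
    mean, basis = channel
    return basis @ (signature - mean)

rng = np.random.default_rng(0)
dim, k = 256, 8  # hypothetical signature size and compressed size

# Two hypothetical location channels, each trained on its own reference signatures.
centers = np.array([[39.9, 116.4], [40.7, -74.0]])
channels = [fit_channel(rng.random((100, dim)), k) for _ in centers]

# On the device: choose a channel from the context cue, then send only the short code.
query_signature = rng.random(dim)
query_gps = np.array([40.0, 116.3])
channel_id = select_channel(query_gps, centers)
code = compress(query_signature, channels[channel_id])
```

In this sketch only `channel_id` and the `k`-dimensional `code` would need to be transmitted, rather than the image or the full signature; how the real system quantizes and encodes the compressed descriptor is not specified in the abstract and is left out here.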