Using redundant bit vectors for near-duplicate image detection

  • Authors:
  • Jun Jie Foo;Ranjan Sinha

  • Affiliations:
  • School of Computer Science & IT, RMIT University, Melbourne, Australia;School of Computer Science & IT, RMIT University, Melbourne, Australia

  • Venue:
  • DASFAA'07 Proceedings of the 12th international conference on Database systems for advanced applications
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

Images are amongst the most widely proliferated form of digital information due to affordable imaging technologies and the Web. In such an environment, the use of digital watermarking for image copyright infringement detection is a challenge. For such tasks, near-duplicate image detection is increasingly attractive due to its ability of automated content analysis; moreover, the application domain also extends to data management. The application of PCA-SIFT features and Locality-Sensitive Hashing (LSH) -- for indexing and retrieval -- has been shown to be highly effective for this task. In this work, we prune the number of PCA-SIFT features and introduce a modified Redundant Bit Vector (RBV) index. This is the first application of the RBV index that shows near-perfect effectiveness. Using the best parameters of our RBV approach, we observe an average recall and precision of 91% and 98%, respectively, with query response time of under 10 seconds on a collection of 20, 000 images. Compared to the baseline (the LSH index), the query response times and index size of the RBV index is 12 times faster and 126 times smaller, respectively. As compared to brute-force sequential scan, the RBV index rapidly reduces the search space to 1/80.