SALSAS: Sub-linear active learning strategy with approximate k-NN search

  • Authors:
  • D. Gorisse;M. Cord;F. Precioso

  • Affiliations:
  • ETIS, ENSEA/CNRS UMR8051/Univ. Cergy-Pontoise, France;LIP6, UPMC - PARIS VI, France;ETIS, ENSEA/CNRS UMR8051/Univ. Cergy-Pontoise, France

  • Venue:
  • Pattern Recognition
  • Year:
  • 2011

Quantified Score

Hi-index 0.01

Visualization

Abstract

With the democratization of digital imaging devices, image databases exponentially grow. Thus, providing the user with a system for searching into these databases is a critical issue. However, bridging the semantic gap between which (semantic) concept(s) the user is looking for and the (semantic) content is quite difficult. In content-based image retrieval (CBIR) systems, a classic scenario is to formulate the user query, at first, with only one example (i.e. one image). In order to address this problem, active learning is a powerful technique which involves the user in interactively refining the query concept, through relevance feedback loops, by asking the user whether some strategically selected images are relevant or not. However, the complexity of state-of-the-art active learning methods is linear in the size of the database and thus dramatically slows down retrieval systems, when dealing with very large databases, which is no longer acceptable for users. In this article, we propose a strategy to overcome scalability limitations of active learning strategies by exploiting ultra fast k-nearest-neighbor (k-NN) methods, as locality sensitive hashing (LSH), and combining them with an active learning strategy dedicated to very large databases. We define a new LSH scheme adapted to @g^2 distance which often leads to better results in image retrieval context. We perform evaluation on databases between 5K and 180K images. The results show that our interactive retrieval system has a complexity almost constant in the size of the database. For a database of 180K images, our system is 45 times faster than exhaustive search (linear scan) reaching similar accuracy.