Interpretable visual models for human perception-based object retrieval

  • Authors:
  • Ahmed Rebai; Alexis Joly; Nozha Boujemaa

  • Affiliations:
  • INRIA Rocquencourt, France (all authors)

  • Venue:
  • Proceedings of the 1st ACM International Conference on Multimedia Retrieval
  • Year:
  • 2011

Abstract

Understanding the results returned by automatic visual concept detectors is often a tricky task, which can make users uncomfortable with these technologies. In this paper we attempt to build humanly interpretable visual models that allow the user to visually understand the underlying semantics. To this end, we propose a supervised multiple-instance learning algorithm that selects as few discriminant local features as possible for a given object category. The method is rooted in lasso theory: an L1-regularization term is introduced to constrain the loss function and thereby produce sparser solutions. Efficient resolution of the lasso path is achieved through a boosting-like procedure inspired by the BLasso algorithm. Quantitatively, our method achieves performance similar to the current state of the art; qualitatively, it lets users construct their own models from the original set of learned patches, thus enabling more compound semantic queries.
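
For illustration only: below is a minimal sketch of a BLasso-style forward/backward boosting loop that approximates the L1-regularized (lasso) path over a dictionary of local-feature weak classifiers, in the spirit of the procedure outlined in the abstract. The names (blasso, H, y, epsilon, xi) and the exponential surrogate loss are assumptions made for this sketch, not the authors' implementation.

    import numpy as np

    def blasso(H, y, epsilon=0.05, xi=1e-6, n_iter=300):
        """Approximate the lasso path with fixed-size forward/backward steps.

        H : (n_samples, n_features) responses of candidate local-feature classifiers.
        y : labels in {-1, +1}.
        Returns a sparse weight vector over the columns of H.
        """
        n, p = H.shape
        beta = np.zeros(p)
        margins = y[:, None] * H                       # per-sample margin contribution of each column

        def loss(b):
            # Exponential boosting surrogate loss (an assumption; the paper may use another convex loss).
            return np.exp(-(margins @ b)).sum()

        def best_forward(b):
            # Try a +/- epsilon move on every coordinate and keep the one with lowest loss.
            best = (0, epsilon, np.inf)
            for j in range(p):
                for s in (+epsilon, -epsilon):
                    b[j] += s
                    l = loss(b)
                    b[j] -= s
                    if l < best[2]:
                        best = (j, s, l)
            return best

        # Initial forward step also fixes the starting regularization level lambda.
        j, s, l_new = best_forward(beta)
        lam = (loss(beta) - l_new) / epsilon
        beta[j] += s

        for _ in range(n_iter):
            if lam <= 0:
                break
            l_cur = loss(beta)
            # Backward step: shrink one active coordinate by epsilon if that lowers
            # the penalized objective  loss + lam * ||beta||_1  by more than xi.
            best_back = (None, np.inf)
            for j in np.flatnonzero(beta):
                s = -np.sign(beta[j]) * min(epsilon, abs(beta[j]))
                beta[j] += s
                l = loss(beta)
                beta[j] -= s
                if l < best_back[1]:
                    best_back = ((j, s), l)
            if best_back[0] is not None and best_back[1] - l_cur < lam * epsilon - xi:
                j, s = best_back[0]
                beta[j] += s
            else:
                # Forward step; relax lambda once a step no longer pays for itself.
                j, s, l_new = best_forward(beta)
                beta[j] += s
                lam = min(lam, (l_cur - l_new) / epsilon)
        return beta

The backward steps are what let the procedure trace an approximate lasso path rather than a plain forward-stagewise path; they are the source of the sparsity that leaves only a handful of patches with non-zero weight, which is what makes the resulting visual model inspectable.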