Retrieval evaluation and distance learning from perceived similarity between endomicroscopy videos

  • Authors:
  • Barbara André; Tom Vercauteren; Anna M. Buchner; Michael B. Wallace; Nicholas Ayache

  • Affiliations:
  • Mauna Kea Technologies, Paris and INRIA - Asclepios, Sophia-Antipolis; Mauna Kea Technologies, Paris; Hospital of the University of Pennsylvania, Philadelphia; Mayo Clinic, Jacksonville, Florida; INRIA - Asclepios, Sophia-Antipolis

  • Venue:
  • MICCAI'11: Proceedings of the 14th International Conference on Medical Image Computing and Computer-Assisted Intervention - Volume Part III
  • Year:
  • 2011

Abstract

Evaluating content-based retrieval (CBR) is challenging because it requires an adequate ground-truth. When the available ground-truth is limited to textual metadata such as pathological classes, retrieval results can only be evaluated indirectly, for example in terms of classification performance. In this study, we first present a tool for generating a perceived similarity ground-truth that enables direct evaluation of endomicroscopic video retrieval. This tool uses a four-point Likert scale and collects the subjective pairwise similarities perceived by multiple expert observers. We then use the generated ground-truth to evaluate a previously developed dense bag-of-visual-words method for endomicroscopic video retrieval. Confirming the results of the previous indirect evaluation based on classification, our direct evaluation shows that this method significantly outperforms several other state-of-the-art CBR methods. In a second step, we propose to improve the CBR method by learning an adjusted similarity metric from the perceived similarity ground-truth. By minimizing a margin-based cost function that differentiates similar and dissimilar video pairs, we learn a weight vector that is applied to the visual word signatures of the videos. Using cross-validation, we demonstrate that the learned similarity distance is significantly better correlated with the perceived similarity than the original visual-word-based distance.
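
The abstract does not give the exact form of the margin-based cost, so the following is only a minimal sketch of the general idea under stated assumptions: the distance between two visual word signatures is taken to be a weighted sum of per-word squared differences, the cost is a hinge loss around a fixed threshold, and the weights are fitted by projected subgradient descent. The function names (`pair_features`, `hinge_cost`, `learn_weights`), the parameters (`threshold`, `margin`, `lr`, `epochs`), and the toy data are all hypothetical and are not taken from the paper.

```python
def pair_features(signatures, pairs):
    """Per-word squared differences for each (i, j) video pair."""
    return [[(a - b) ** 2 for a, b in zip(signatures[i], signatures[j])]
            for i, j in pairs]

def weighted_distance(w, feat):
    """Assumed distance: weighted sum of per-word squared differences."""
    return sum(wk * fk for wk, fk in zip(w, feat))

def hinge_cost(w, feats, labels, threshold=1.0, margin=0.5):
    """Mean hinge cost; label +1 = perceived similar, -1 = perceived dissimilar.

    Similar pairs are pushed below (threshold - margin),
    dissimilar pairs above (threshold + margin).
    """
    losses = [max(0.0, margin + y * (weighted_distance(w, f) - threshold))
              for f, y in zip(feats, labels)]
    return sum(losses) / len(losses)

def learn_weights(feats, labels, threshold=1.0, margin=0.5, lr=0.05, epochs=200):
    """Projected subgradient descent on the hinge cost, keeping weights >= 0."""
    n_words = len(feats[0])
    w = [1.0] * n_words  # start from the unweighted distance
    for _ in range(epochs):
        grad = [0.0] * n_words
        for f, y in zip(feats, labels):
            # Only pairs that violate the margin contribute to the subgradient.
            if margin + y * (weighted_distance(w, f) - threshold) > 0:
                for k in range(n_words):
                    grad[k] += y * f[k] / len(feats)
        w = [max(0.0, wk - lr * gk) for wk, gk in zip(w, grad)]
    return w

# Toy example (made-up data): word 0 separates the two groups, word 1 is noise.
signatures = [[0.0, 0.0], [0.0, 1.0], [2.0, 0.0], [2.0, 1.0]]
pairs = [(0, 1), (2, 3), (0, 2), (1, 3), (0, 3), (1, 2)]
labels = [1, 1, -1, -1, -1, -1]

feats = pair_features(signatures, pairs)
w = learn_weights(feats, labels)
print("learned weights:", w)
print("cost before:", hinge_cost([1.0, 1.0], feats, labels))
print("cost after: ", hinge_cost(w, feats, labels))
```

In this toy setting the weight of the uninformative word shrinks while the discriminative word keeps its weight, which is the kind of adjustment a learned weight vector is meant to provide. In practice the labels would be derived from the Likert-scale ratings, and the weights would be evaluated on held-out pairs, as the cross-validation in the abstract suggests.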