Active selection for multi-example querying by content

  • Authors: A. P. Natsev; J. R. Smith

  • Affiliation: IBM Thomas J. Watson Research Center, Hawthorne, NY, USA (both authors)

  • Venue: ICME '03: Proceedings of the 2003 International Conference on Multimedia and Expo, Volume 2
  • Year: 2003

Abstract

Multi-example content-based retrieval (MECBR) is the process of querying content by specifying multiple query examples within a single query iteration. MECBR attempts to mitigate some of the semantic limitations of traditional relevance feedback and content-based retrieval (CBR) techniques by allowing multiple query examples and thus a more accurate model of the user's query need. It also attempts to minimize the burden on the user, as compared to relevance feedback methods, by eliminating the need for user feedback and limiting all interaction to a single query specification step. MECBR is therefore a simple alternative for modeling low- and mid-level semantics without the heavy user interaction required by interactive feedback systems or the extensive training required by complex statistical modeling approaches. In this paper, we describe the MECBR technique in some detail and study methods for active selection of query examples and query features. In particular, we propose and investigate techniques for automatic query example selection, feature selection, and feature fusion. We compare different approaches and evaluate the performance of different parameter settings through an extensive empirical study. We also compare MECBR performance to that of explicitly built semantic models using state-of-the-art support vector machines (SVMs). We find that lightweight MECBR performs up to 60% better for rare concepts and only 12% to 25% worse for frequent concepts, as compared to heavyweight SVM modeling. This shows that MECBR is not only a viable lightweight alternative to statistical semantic modeling but is also preferable for very diverse or rare-class semantic modeling situations.
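
To make the idea of querying with several examples in one step concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not specify the distance measure or the fusion rule, so the Euclidean distance, the `min` and `mean` fusion functions, and all names (`mecbr_rank`, `database`, `examples`) are assumptions chosen only for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(values):
    """Arithmetic mean, used here as one possible fusion rule."""
    return sum(values) / len(values)


def mecbr_rank(examples, database, fuse=min):
    """Rank database items by their fused distance to all query examples.

    examples: list of query feature vectors supplied in a single query step.
    database: dict mapping item id -> feature vector.
    fuse:     hypothetical fusion rule that combines per-example distances
              into one score (min rewards closeness to any example,
              mean rewards closeness to all of them).
    """
    scored = []
    for item_id, vec in database.items():
        dists = [euclidean(vec, q) for q in examples]
        scored.append((fuse(dists), item_id))
    scored.sort()  # smallest fused distance first
    return [item_id for _, item_id in scored]


# Toy data, invented for illustration only.
database = {
    "a": [0.0, 0.0],
    "b": [1.0, 1.0],
    "c": [5.0, 5.0],
    "d": [0.9, 0.1],
}
examples = [[0.0, 0.0], [1.0, 0.0]]

ranking_min = mecbr_rank(examples, database, fuse=min)
ranking_avg = mecbr_rank(examples, database, fuse=mean)
print(ranking_min)
print(ranking_avg)
```

Swapping the `fuse` argument is one simple way to experiment with how multiple examples are combined; the paper's own example selection, feature selection, and feature fusion methods may differ from this sketch.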