Learning the semantics of multimedia queries and concepts from a small number of examples

  • Authors:
  • Apostol (Paul) Natsev; Milind R. Naphade; Jelena Tešić

  • Affiliations:
  • IBM Watson Research Center, Hawthorne, NY (all authors)

  • Venue:
  • Proceedings of the 13th annual ACM international conference on Multimedia
  • Year:
  • 2005


Abstract

In this paper we unify two supposedly distinct tasks in multimedia retrieval. One task involves answering queries given a few examples. The other involves learning models for semantic concepts, also from a few examples. In our view these two tasks are identical, differing only in the number of examples available for training. Once we adopt this unified view, we apply identical techniques to both problems and evaluate performance using the NIST TRECVID benchmark evaluation data [15]. We propose a combination hypothesis over two complementary classes of techniques: a nearest neighbor model using only positive examples, and a discriminative support vector machine model using both positive and negative examples. In the case of queries, where negative examples are rarely provided to seed the search, we create pseudo-negative samples. We then combine the ranked lists produced by evaluating the test database with both methods to create a final ranked list of retrieved multimedia items. We evaluate this approach for rare concept and query topic modeling using the NIST TRECVID video corpus.

In both tasks we find that applying the combination hypothesis across both modeling techniques and a variety of features yields better performance than any of the baseline models, as well as improved robustness with respect to training examples and visual features. In particular, we observe an improvement of 6% for rare concept detection and 17% for the search task.
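
The following Python sketch illustrates the general shape of the combination hypothesis described in the abstract: a nearest-neighbor score from positive examples, an SVM trained against randomly sampled pseudo-negatives, and a simple rank-based fusion of the two lists. It assumes scikit-learn and precomputed feature vectors; the function name, sampling strategy, and fusion rule here are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of the combination hypothesis:
# NN model on positives + SVM with pseudo-negatives, fused by normalized rank.
import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import NearestNeighbors

def score_combination(positives, database, n_pseudo_neg=100, seed=0):
    """Return fused relevance scores (higher = more relevant) for each database item.

    positives : (p, d) array of positive example features
    database  : (n, d) array of features for the items to rank
    """
    rng = np.random.default_rng(seed)

    # 1) Nearest-neighbor model: score is the negated distance to the closest positive.
    nn = NearestNeighbors(n_neighbors=1).fit(positives)
    dist, _ = nn.kneighbors(database)
    nn_scores = -dist.ravel()

    # 2) Discriminative SVM model with pseudo-negatives sampled at random from the
    #    database (a stand-in for the paper's pseudo-negative sampling when no
    #    negative examples are supplied with the query).
    neg_idx = rng.choice(len(database),
                         size=min(n_pseudo_neg, len(database)),
                         replace=False)
    X = np.vstack([positives, database[neg_idx]])
    y = np.concatenate([np.ones(len(positives)), np.zeros(len(neg_idx))])
    svm = SVC(kernel="rbf").fit(X, y)
    svm_scores = svm.decision_function(database)  # higher = more like the positives

    # 3) Fuse the two ranked lists; averaging normalized ranks is one simple rule
    #    (the paper's exact fusion scheme may differ).
    def rank_norm(s):
        ranks = np.argsort(np.argsort(-s))          # 0 = best-ranked item
        return 1.0 - ranks / max(len(s) - 1, 1)
    return 0.5 * (rank_norm(nn_scores) + rank_norm(svm_scores))
```

A usage example would pass a handful of query example vectors as `positives` and the feature matrix of the test collection as `database`, then sort items by the returned scores to produce the final retrieval list.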