Automatic query-time generation of retrieval expert coefficients for multimedia retrieval

  • Authors:
  • Peter Wilkins

  • Affiliations:
  • Dublin City University, Dublin, Ireland

  • Venue:
  • SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

Content-based Multimedia Information Retrieval can be defined as the task of matching a multi-modal information need against various components of a multimedia corpus and retrieving relevant elements. Generally the matching and retrieval takes place across multiple 'features' which can either be visual or audio, or can be high-level or low-level, and each of which can be seen to be an independent retrieval expert. The task of answering a query can thus be formulated as a data fusion problem. Depending on the query, each expert may perform differently and so retrieval coefficients can be used to weight each expert to increase overall performance. Previous approaches to expert coefficient generation have included query-independent coefficients, identification of query-classes and machine learning methods, to name a few. The approach I propose is different, as it seeks to dynamically create expert coefficients which are query-dependent. This approach is based upon earlier experiments where an initial correlation was observed between the score distribution of a retrieval expert, and its relative performance when compared against other experts for that query. I have created a basic method which leverages these observations to create query-time coefficients which achieve comparable performance to oracle-determined query-independent weights, for the experts and collections used in the aforementioned experiment. Previous research which examinedscore distribution did so with respect to relevance, whereas this work seeks to compare expert scores for a given query to determine relative performance. In my work I aim to explore this correlation by eliminating potential bias from the data collections, the retrieval experts and the queries used in experiments to obtain more robust observations. Using and extending previous investigations into data fusion, I will explore where data fusion succeeds in multimedia retrieval, and where it does not. I then aim to refine and extend my existing techniques for automatic coefficient generation to incorporate the new observations, so as to improve performance. Finally I will combine this approach with existing data fusion methods, such as query-class coefficients, with each approach complimenting the other to achieve further performance improvements.