Using score distributions for query-time fusion in multimedia retrieval
MIR '06 Proceedings of the 8th ACM international workshop on Multimedia information retrieval
Content-based Multimedia Information Retrieval can be defined as the task of matching a multi-modal information need against various components of a multimedia corpus and retrieving relevant elements. Generally, matching and retrieval take place across multiple 'features', which may be visual or audio, high-level or low-level, and each of which can be seen as an independent retrieval expert. The task of answering a query can thus be formulated as a data fusion problem. Depending on the query, each expert may perform differently, so retrieval coefficients can be used to weight each expert and increase overall performance. Previous approaches to expert coefficient generation have included query-independent coefficients, identification of query-classes and machine learning methods, to name a few. The approach I propose is different, as it seeks to dynamically create expert coefficients which are query-dependent. This approach is based upon earlier experiments in which an initial correlation was observed between the score distribution of a retrieval expert and its performance relative to other experts for that query. I have created a basic method which leverages these observations to create query-time coefficients which achieve performance comparable to oracle-determined query-independent weights, for the experts and collections used in the aforementioned experiment. Previous research which examined score distributions did so with respect to relevance, whereas this work seeks to compare expert scores for a given query to determine relative performance. In my work I aim to explore this correlation by eliminating potential bias from the data collections, the retrieval experts and the queries used in experiments, so as to obtain more robust observations. Using and extending previous investigations into data fusion, I will explore where data fusion succeeds in multimedia retrieval, and where it does not.
I then aim to refine and extend my existing techniques for automatic coefficient generation to incorporate the new observations, so as to improve performance. Finally, I will combine this approach with existing data fusion methods, such as query-class coefficients, with each approach complementing the other to achieve further performance improvements.
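To make the idea of query-time expert coefficients concrete, the following is a minimal sketch of weighted linear fusion in Python. It is not the method evaluated in this work: the abstract does not specify how coefficients are derived from score distributions, so the `distribution_coefficient` heuristic (one minus the mean min-max-normalized score), the function names and the expert names are all assumptions made purely for illustration.

```python
from typing import Dict, List, Tuple

Scores = Dict[str, float]  # document id -> raw score from one expert


def min_max_normalize(scores: Scores) -> Scores:
    """Rescale one expert's scores to [0, 1]; a constant list maps to 0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def distribution_coefficient(normalized: Scores) -> float:
    """Hypothetical heuristic (an assumption, not the author's method):
    one minus the mean normalized score, so an expert whose scores fall
    away quickly from the top receives a larger coefficient."""
    if not normalized:
        return 0.0
    return 1.0 - sum(normalized.values()) / len(normalized)


def fuse(
    expert_scores: Dict[str, Scores],
) -> Tuple[List[Tuple[str, float]], Dict[str, float]]:
    """Weighted sum of normalized scores, with coefficients computed
    per query from each expert's own score distribution."""
    if not expert_scores:
        return [], {}
    normalized = {name: min_max_normalize(s) for name, s in expert_scores.items()}
    raw = {name: distribution_coefficient(n) for name, n in normalized.items()}
    total = sum(raw.values())
    if total == 0.0:
        # Fall back to uniform weights when the heuristic gives no signal.
        weights = {name: 1.0 / len(raw) for name in raw}
    else:
        weights = {name: c / total for name, c in raw.items()}
    fused: Dict[str, float] = {}
    for name, scores in normalized.items():
        for doc, s in scores.items():
            fused[doc] = fused.get(doc, 0.0) + weights[name] * s
    # Highest fused score first; ties broken by document id.
    ranking = sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranking, weights
```

Calling `fuse({"colour": {...}, "edge": {...}})` with one score dictionary per expert returns the fused ranking together with the coefficients chosen for that query. Because the coefficients are recomputed from each query's score lists, the same code could in principle be combined with query-class or query-independent weights by multiplying the two sets of coefficients before normalizing.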