Incorporating statistical topic information in relevance feedback

  • Authors:
  • Karla L. Caballero;Ram Akella

  • Affiliations:
  • University of California Santa Cruz, Santa Cruz, CA, USA;University of California Santa Cruz, Santa Cruz, CA, USA

  • Venue:
  • SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Most of the relevance feedback algorithms only use document terms as feedback (local features) in order to update the query and re-rank the documents to show to the user. This approach is limited by the terms of those documents without any global context. We propose to use statistical topic modeling techniques in relevance feedback to incorporate a better estimate of context by including global information about the document. This is particularly helpful for difficult queries where learning the context from the interactions with the user is crucial. We propose to use the topic mixture information obtained to characterize the documents and learn their topics. Then, we rank documents incorporating positive and negative feedback by fitting a latent distribution for each class of documents online and combining all the features using Bayesian Logistic Regression. We show results using the OHSUMED dataset for 3 different variants and obtain higher performance, up to 12.5% in Mean Average Precision (MAP).