Fully utilize feedbacks: language model based relevance feedback in information retrieval

  • Authors:
  • Sheng-Long Lv; Zhi-Hong Deng; Hang Yu; Ning Gao; Jia-Jian Jiang

  • Affiliations:
  • Key Laboratory of Machine Perception (Ministry of Education), School of Electronics Engineering and Computer Science, Peking University, Beijing, China (all five authors)

  • Venue:
  • ADMA'11 Proceedings of the 7th international conference on Advanced Data Mining and Applications - Volume Part I
  • Year:
  • 2011

Abstract

Relevance feedback has been proposed as an effective way to improve the precision of information retrieval. However, most research on relevance feedback is based on the vector space model, and the resulting methods cannot be used in other, more complicated and powerful models, such as the language model and the logic model. Meanwhile, other work is conceptually restricted to viewing a query as a set of terms, and so cannot be naturally applied to the more general case in which the query is treated as a sequence of terms and the frequency of each query term is taken into account. In this paper, we focus on a relevance feedback algorithm based on the language model. We use a mixture model to describe the process of generating a document and use EM to estimate the model's parameters. Our work also employs semi-supervised learning to compute the collection model and proposes an effective way to obtain feedback from irrelevant documents to improve our algorithm.
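
The abstract does not spell out the mixture model, so the following is only a minimal sketch of one common formulation, not the authors' algorithm. It assumes each token in the feedback documents is generated either by a fixed collection (background) model or by an unknown topic model, and uses EM to estimate the topic model. The function name `estimate_feedback_model`, the mixing weight `lam`, and the toy data are all hypothetical; the semi-supervised estimation of the collection model and the use of irrelevant documents mentioned above are not covered.

```python
from collections import Counter


def estimate_feedback_model(feedback_docs, collection_model, lam=0.5, iterations=50):
    """Estimate a topic model from feedback documents with EM.

    Assumption (not from the paper): every token is drawn from the
    collection model with probability `lam`, otherwise from the topic model.
    `feedback_docs` is a list of token lists; `collection_model` maps a
    word to its background probability.
    """
    counts = Counter(tok for doc in feedback_docs for tok in doc)
    total = sum(counts.values())
    # Start from the relative frequencies in the feedback documents.
    theta = {w: c / total for w, c in counts.items()}

    for _ in range(iterations):
        # E-step: posterior probability that an occurrence of w
        # was generated by the topic model rather than the background.
        posterior = {}
        for w in counts:
            topic = (1.0 - lam) * theta[w]
            # Small floor for words missing from the collection model.
            background = lam * collection_model.get(w, 1e-9)
            posterior[w] = topic / (topic + background)

        # M-step: re-estimate the topic model from the expected counts.
        expected = {w: counts[w] * posterior[w] for w in counts}
        norm = sum(expected.values())
        theta = {w: v / norm for w, v in expected.items()}

    return theta


# Toy usage: "the" is frequent in the feedback documents but also very
# common in the collection, so EM shifts probability mass away from it.
docs = [["jaguar", "car", "the"], ["jaguar", "speed", "the", "the"]]
background = {"the": 0.5, "car": 0.05, "jaguar": 0.01, "speed": 0.02}
topic_model = estimate_feedback_model(docs, background)
```

In this kind of model, the fixed background component absorbs words that are common across the whole collection, so the estimated topic model concentrates on terms that distinguish the feedback documents.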