Exploring social annotation tags to enhance information retrieval performance

  • Authors:
  • Zheng Ye; Xiangji Jimmy Huang; Song Jin; Hongfei Lin

  • Affiliations:
  • School of Information Technology, York University, Toronto, Ontario, Canada and Department of Computer Science and Engineering, Dalian University of Technology, Dalian, Liaoning, China; School of Information Technology, York University, Toronto, Ontario, Canada; Department of Computer Science and Engineering, Dalian University of Technology, Dalian, Liaoning, China; Department of Computer Science and Engineering, Dalian University of Technology, Dalian, Liaoning, China

  • Venue:
  • AMT'10: Proceedings of the 6th International Conference on Active Media Technology
  • Year:
  • 2010

Abstract

Pseudo relevance feedback (PRF) via query expansion has proven effective in many information retrieval tasks. Most existing approaches assume that the most informative terms in the top-ranked documents from the first-pass retrieval can be viewed as the context of the query, and can therefore be used to specify the information need. However, some of the documents used in PRF may be irrelevant (especially for hard topics), which introduces noise into the feedback process. The recent development of Web 2.0 technologies on the Internet provides an opportunity to enhance PRF, as more and more high-quality resources can be freely obtained. In this paper, we propose a generative model to select high-quality feedback terms from social annotation tags. The main advantages of our proposed feedback model are as follows. First, our model explicitly explains how each feedback term is generated. Second, our model can take advantage of the human-annotated semantic relationships among terms. Experimental results on three TREC test datasets show that social annotation tags can be used as a good external resource for PRF. With optimal parameter settings on the WSJ dataset, they are as good as the top-ranked documents from first-pass retrieval. When we combine the top-ranked documents and the social annotation tags, the retrieval performance can be further improved.
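
As a rough illustration of the PRF assumption described in the abstract, the sketch below expands a query with the highest-scoring terms from the top-ranked documents of a first-pass retrieval. It is a generic TF-IDF-style heuristic, not the generative model proposed in the paper; the function name `expand_query`, its parameters, and the toy corpus are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def expand_query(query_terms, ranked_docs, corpus_docs, k=2, n_terms=3):
    """Generic PRF sketch (assumption: TF-IDF-style scoring, not the paper's model).

    query_terms: list of query tokens
    ranked_docs: first-pass results, best first, each a list of tokens
    corpus_docs: all documents, used only for document frequencies
    """
    n_docs = len(corpus_docs)

    # Document frequency of each term over the whole collection.
    df = Counter()
    for doc in corpus_docs:
        df.update(set(doc))

    # Term frequency over the top-k feedback documents only.
    tf = Counter()
    for doc in ranked_docs[:k]:
        tf.update(doc)

    # Score candidate terms; terms already in the query are skipped.
    scores = {
        term: tf[term] * math.log((n_docs + 1) / (df[term] + 1))
        for term in tf
        if term not in query_terms
    }

    # Highest score first; ties are broken alphabetically for determinism.
    best = sorted(scores, key=lambda t: (-scores[t], t))[:n_terms]
    return list(query_terms) + best


# Toy corpus (hypothetical data).
docs = [
    ["jaguar", "car", "engine", "speed"],
    ["jaguar", "car", "dealer", "price"],
    ["jaguar", "cat", "jungle", "animal"],
    ["stock", "market", "price"],
]
expanded = expand_query(["jaguar"], ranked_docs=docs[:3], corpus_docs=docs, k=2, n_terms=1)
print(expanded)
```

If an irrelevant document is ranked in the top k, its terms enter `tf` and can be selected as feedback terms, which is the noise problem the abstract points to. Under the same assumptions, the tag sets attached to related resources could be passed in place of `ranked_docs` to draw feedback terms from social annotations instead.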