On the query reformulation technique for effective MEDLINE document retrieval

  • Authors:
  • Sooyoung Yoo; Jinwook Choi

  • Affiliations:
  • Medical Information Center, Seoul National University Bundang Hospital, 166 Gumi-Ro, Bundang-Gu, Seongnam-Si, Gyeonggi-Do 463-707, Republic of Korea; Department of Biomedical Engineering, College of Medicine, Seoul National University, 28 Yongon-Dong, Chongro-Gu, Seoul 110-799, Republic of Korea

  • Venue:
  • Journal of Biomedical Informatics
  • Year:
  • 2010

Abstract

Improving the retrieval accuracy of MEDLINE documents remains challenging because retrieval precision is low. Focusing on query expansion based on pseudo-relevance feedback (PRF), this paper addresses the problem by systematically examining the effects of expansion term selection and of adjusting the term weights of the expanded query, using a set of MEDLINE test documents called OHSUMED. We implemented a baseline information retrieval system based on the Okapi BM25 retrieval model, compared six well-known term ranking algorithms for selecting useful expansion terms, and then compared traditional term reweighting algorithms with our new variant of the standard Rocchio feedback formula, which adopts a group-based weighting scheme. Our experimental results on the OHSUMED test collection showed maximum improvements of 20.2% in mean average precision and 20.4% in recall over unexpanded queries when terms were expanded using a co-occurrence analysis-based term ranking algorithm in conjunction with our term reweighting algorithm (p-value
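
For readers unfamiliar with pseudo-relevance feedback, the general idea can be illustrated with a minimal sketch of the textbook Rocchio update using positive feedback only. This is not the group-based variant proposed in the paper, and it does not implement any of the six term ranking algorithms compared there. The function name `rocchio_prf`, the parameters `top_r`, `n_expand`, `alpha`, and `beta`, their default values, and the use of the mean term frequency as the expansion term score are all assumptions made for illustration.

```python
from collections import Counter


def rocchio_prf(query_terms, ranked_docs, top_r=10, n_expand=5,
                alpha=1.0, beta=0.75):
    """Expand and reweight a query with pseudo-relevance feedback.

    query_terms: list of tokens in the original query.
    ranked_docs: list of token lists, already ordered by a first-pass
                 retrieval run (for example BM25); hypothetical input.
    Returns a dict mapping each term to its new query weight.
    """
    q = Counter(query_terms)
    feedback = ranked_docs[:top_r]  # assume the top documents are relevant
    if not feedback:
        return {t: alpha * tf for t, tf in q.items()}

    # Centroid of the pseudo-relevant documents (mean raw term frequency).
    totals = Counter()
    for doc in feedback:
        totals.update(doc)
    centroid = {t: tf / len(feedback) for t, tf in totals.items()}

    # Illustrative term ranking: highest centroid weight first,
    # ties broken alphabetically, original query terms excluded.
    candidates = sorted((t for t in centroid if t not in q),
                        key=lambda t: (-centroid[t], t))[:n_expand]

    # Rocchio update without a non-relevant component:
    # new weight = alpha * original weight + beta * centroid weight.
    new_q = {t: alpha * q[t] + beta * centroid.get(t, 0.0) for t in q}
    for t in candidates:
        new_q[t] = beta * centroid[t]
    return new_q


if __name__ == "__main__":
    docs = [["heart", "attack", "aspirin"],
            ["heart", "aspirin", "therapy"],
            ["kidney", "stone"]]
    print(rocchio_prf(["heart", "attack"], docs, top_r=2, n_expand=1))
```

In this toy setup, original query terms keep their weight and gain a boost from the feedback documents, while expansion terms enter with the feedback weight alone. A real system would replace the mean term frequency with a proper term ranking score and tune `alpha` and `beta` on held-out queries.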