Simple Weighting Techniques for Query Expansion in Biomedical Document Retrieval

Authors:
Young-In Song;Kyoung-Soo Han;So-Young Park;Sang-Bum Kim;Hae-Chang Rim
Venue:
IEICE - Transactions on Information and Systems
Year:
2007

Abstract

In this paper, we propose two weighting techniques that improve the performance of query expansion in biomedical document retrieval, especially when a short biomedical term in a query is expanded with its synonymous multi-word terms. When a query contains synonymous terms of different lengths, a traditional IR model ranks a document containing a longer term more highly, because a longer term has more chances to match the query. However, such a preference is clearly inappropriate and often yields unsatisfactory results. To alleviate this biased weighting problem, we devise a method that normalizes the weights of the query words within a long multi-word biomedical term, and a method that discriminates among terms using inverse terminology frequency, a novel statistic estimated in the query domain. Experimental results on the MEDLINE corpus show that our two simple techniques improve retrieval performance by correcting the undue preference for long multi-word terms in an expanded query.
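
The abstract does not give formulas, so the following Python is only a minimal sketch of the two ideas under stated assumptions, not the authors' method. It assumes that length normalization means splitting a unit weight evenly over the words of each synonym, and that inverse terminology frequency resembles an IDF-style statistic counted over the synonyms of the expanded query. The function names, the smoothing term, and the example synonyms are all hypothetical.

```python
import math
from collections import Counter


def length_normalized_weights(synonyms):
    """Assumed normalization: each synonym carries a total weight of 1.0,
    split evenly over its words, so a long synonym cannot outweigh a short one."""
    weighted = []
    for term in synonyms:
        words = term.lower().split()  # assumes every synonym is non-empty
        share = 1.0 / len(words)
        weighted.append([(word, share) for word in words])
    return weighted


def inverse_terminology_frequency(synonyms):
    """Assumed IDF-like statistic: words shared by many synonyms of the query
    get a lower score than words that occur in only one synonym."""
    n_terms = len(synonyms)
    counts = Counter(
        word for term in synonyms for word in set(term.lower().split())
    )
    # The +1 smoothing is an assumption that keeps every score positive.
    return {word: math.log((n_terms + 1) / count) for word, count in counts.items()}


# Hypothetical expanded query: one short term and two multi-word synonyms.
synonyms = ["AD", "Alzheimer disease", "Alzheimer type dementia"]
weights = length_normalized_weights(synonyms)
itf = inverse_terminology_frequency(synonyms)
```

In this sketch every synonym contributes the same total weight regardless of its length, and a word such as "alzheimer", which appears in two of the three synonyms, receives a lower score than a word that appears in only one.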