Sequential result refinement for searching the biomedical literature

  • Authors:
  • L. Y. Tanaka; J. R. Herskovic; M. S. Iyengar; E. V. Bernstam

  • Affiliations:
  • School of Health Information Sciences, The University of Texas Health Science Center at Houston, 7000 Fannin Street, Suite 600, Houston, TX 77030, USA and Kapiolani Medical Center for Women & Chil ...; School of Health Information Sciences, The University of Texas Health Science Center at Houston, 7000 Fannin Street, Suite 600, Houston, TX 77030, USA; School of Health Information Sciences, The University of Texas Health Science Center at Houston, 7000 Fannin Street, Suite 600, Houston, TX 77030, USA; School of Health Information Sciences, The University of Texas Health Science Center at Houston, 7000 Fannin Street, Suite 600, Houston, TX 77030, USA and Department of Internal Medicine, The Univ ...

  • Venue:
  • Journal of Biomedical Informatics
  • Year:
  • 2009

Abstract

Information overload is a problem for users of MEDLINE, the database of biomedical literature that indexes over 17 million articles. Various techniques have been developed to retrieve high-quality or important articles. Some techniques rely on using the number of citations as a measurement of an article's importance. Unfortunately, citation information is proprietary, expensive, and suffers from "citation lag." MEDLINE users have a variety of information needs. Although some users require high recall, many users are looking for a "few good articles" on a topic. For these users, precision is more important than recall. We present and evaluate a method for identifying articles likely to be highly cited by using information available at the time of listing in MEDLINE. The method uses a score based on Medical Subject Headings (MeSH) terms, journal impact factor (JIF), and number of authors. This method can filter large MEDLINE result sets (1000 articles) returned by actual user queries to produce small, highly cited result sets.
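The abstract names the three inputs to the score (MeSH terms, JIF, and number of authors) but does not give the formula. The following is a minimal Python sketch of how such a filter might look, not the authors' method. It assumes a simple weighted sum and top-k selection. The names `Article`, `score`, `refine`, and `precision`, the weights, and the cutoff `k` are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    """Fields available when an article is listed (hypothetical record layout)."""
    pmid: str
    mesh_terms: frozenset[str]
    journal_impact_factor: float
    num_authors: int


def score(article: Article, mesh_weights: dict[str, float],
          jif_weight: float = 1.0, author_weight: float = 0.1) -> float:
    """Assumed scoring rule: a weighted sum of the three signals.

    The weights are placeholders; the abstract does not state how the
    signals are actually combined.
    """
    mesh_score = sum(mesh_weights.get(term, 0.0) for term in article.mesh_terms)
    return (mesh_score
            + jif_weight * article.journal_impact_factor
            + author_weight * article.num_authors)


def refine(articles: list[Article], mesh_weights: dict[str, float],
           k: int = 10) -> list[Article]:
    """Reduce a large result set to the k highest-scoring articles."""
    ranked = sorted(articles, key=lambda a: score(a, mesh_weights), reverse=True)
    return ranked[:k]


def precision(selected: list[Article], highly_cited_pmids: set[str]) -> float:
    """Fraction of the selected articles that are in the highly cited set."""
    if not selected:
        return 0.0
    hits = sum(1 for a in selected if a.pmid in highly_cited_pmids)
    return hits / len(selected)
```

In this sketch, `refine` trades recall for precision: it discards most of the result set and keeps only the articles the score ranks highest, which fits the "few good articles" use case. `precision` can then be computed against citation counts once they become available.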