Probabilistic term variant generator for biomedical terms

  • Authors:
  • Yoshimasa Tsuruoka; Jun'ichi Tsujii

  • Affiliations:
  • CREST, JST (Japan Science and Technology Corporation), Saitama, Japan and University of Tokyo, Tokyo, Japan; University of Tokyo, Tokyo, Japan and CREST, JST (Japan Science and Technology Corporation), Saitama, Japan

  • Venue:
  • Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2003

Abstract

This paper presents an algorithm to generate possible variants for biomedical terms. The algorithm gives each variant its generation probability representing its plausibility, which is potentially useful for query and dictionary expansions. The probabilistic rules for generating variants are automatically learned from raw texts using an existing abbreviation extraction technique. Our method, therefore, requires no linguistic knowledge or labor-intensive natural language resource. We conducted an experiment using 83,142 MEDLINE abstracts for rule induction and 18,930 abstracts for testing. The results indicate that our method will significantly increase the number of retrieved documents for long biomedical terms.
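
To make the idea of a generation probability concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each rule is a substring rewrite with an attached probability and that a variant's score is the product of the rule probabilities along its most probable derivation. The rule table, the function name generate_variants, and the threshold parameters are all hypothetical; in the paper the rules are learned automatically from MEDLINE text rather than written by hand.

```python
import heapq

# Hypothetical rewrite rules: (source substring, target substring, probability).
# The values are invented for illustration; they are not taken from the paper.
RULES = [
    ("-", " ", 0.3),
    (" ", "-", 0.2),
    ("kappa", "k", 0.1),
]


def generate_variants(term, rules=RULES, threshold=0.01, max_variants=20):
    """Return {variant: probability}, expanding the most probable variant first.

    The probability of a variant is the product of the rule probabilities
    on its best derivation from the input term (an assumption of this sketch).
    """
    best = {term: 1.0}        # best probability seen so far for each string
    heap = [(-1.0, term)]     # max-heap emulated with negated probabilities
    result = {}
    while heap and len(result) < max_variants:
        neg_p, current = heapq.heappop(heap)
        if current in result:
            continue          # already finalized with a higher probability
        p = -neg_p
        result[current] = p
        for src, dst, rule_p in rules:
            # Apply the rule at every position where its source matches.
            start = current.find(src)
            while start != -1:
                variant = current[:start] + dst + current[start + len(src):]
                new_p = p * rule_p
                if new_p >= threshold and new_p > best.get(variant, 0.0):
                    best[variant] = new_p
                    heapq.heappush(heap, (-new_p, variant))
                start = current.find(src, start + 1)
    return result


if __name__ == "__main__":
    variants = generate_variants("NF-kappa B")
    for variant, prob in sorted(variants.items(), key=lambda kv: -kv[1]):
        print(f"{prob:.3f}  {variant}")
```

For query expansion, the returned variants could be added to the original query, optionally weighted by their probabilities, and the threshold controls how aggressively the query is widened.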