Automated identification of synonyms in biomedical acronym sense inventories

  • Authors:
  • Genevieve B. Melton;SungRim Moon;Bridget McInnes;Serguei Pakhomov

  • Affiliations:
  • University of Minnesota, Minneapolis, MN;University of Minnesota, Minneapolis, MN;University of Minnesota, Minneapolis, MN;University of Minnesota, Minneapolis, MN

  • Venue:
  • Louhi '10 Proceedings of the NAACL HLT 2010 Second Louhi Workshop on Text and Data Mining of Health Documents
  • Year:
  • 2010

Abstract

Acronyms are increasingly prevalent in biomedical text, and the task of acronym disambiguation is fundamentally important for biomedical natural language processing systems. Several groups have generated sense inventories of acronym long form expansions from the biomedical literature. Long form sense inventories, however, may contain conceptually redundant expansions that negatively affect their quality. Our approach to improving sense inventories consists of mapping long form expansions to concepts in the Unified Medical Language System (UMLS) with subsequent application of a semantic similarity algorithm based upon conceptual overlap. We evaluated this approach on a reference standard developed for ten acronyms. A total of 119 of 155 (78%) long forms mapped to concepts in the UMLS. Our approach identified synonymous long forms with a sensitivity of 70.2% and a positive predictive value of 96.3%. Although further refinements are needed, this study demonstrates the potential value of using automated techniques to merge synonymous biomedical acronym long forms to improve the quality of biomedical acronym sense inventories.
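
The pipeline described in the abstract (map each long form to concepts, compare concept sets, then score the merged pairs against a reference standard) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration rather than the authors' implementation: the abstract says only that similarity is "based upon conceptual overlap", so Jaccard overlap stands in for it; the 0.5 threshold, the function names, the example long forms, and the placeholder concept identifiers (which are not real UMLS CUIs) are all hypothetical. The actual mapping of long forms to the UMLS is not shown.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap of two concept-ID sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def synonym_pairs(concepts_by_long_form, threshold=0.5):
    """Return long-form pairs whose concept overlap meets the (assumed) threshold."""
    pairs = set()
    # Sorting makes each pair a canonical (smaller, larger) tuple.
    for lf1, lf2 in combinations(sorted(concepts_by_long_form), 2):
        score = jaccard(concepts_by_long_form[lf1], concepts_by_long_form[lf2])
        if score >= threshold:
            pairs.add((lf1, lf2))
    return pairs


def sensitivity_and_ppv(predicted, reference):
    """Score predicted synonym pairs against reference-standard pairs."""
    true_positives = len(predicted & reference)
    sensitivity = true_positives / len(reference) if reference else 0.0
    ppv = true_positives / len(predicted) if predicted else 0.0
    return sensitivity, ppv


# Hypothetical long forms for one acronym, with placeholder concept IDs.
# A long form that did not map to any concept gets an empty set.
concepts = {
    "computed tomography": {"C_TOMO"},
    "computerized tomography": {"C_TOMO", "C_IMAGING"},
    "connective tissue": {"C_TISSUE"},
    "unmapped long form": set(),
}

predicted = synonym_pairs(concepts)
reference = {("computed tomography", "computerized tomography")}
sens, ppv = sensitivity_and_ppv(predicted, reference)
```

In this sketch a long form with no mapped concepts can never be merged, which is one way unmapped long forms would limit sensitivity while leaving positive predictive value unaffected.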