A recent advance in the automatic indexing of the biomedical literature

  • Authors:
  • Aurélie Névéol; Sonya E. Shooshan; Susanne M. Humphrey; James G. Mork; Alan R. Aronson

  • Affiliations:
  • National Institutes of Health, US National Library of Medicine, 8600 Rockville Pike, Bethesda, MD 20894, USA (all authors)

  • Venue:
  • Journal of Biomedical Informatics
  • Year:
  • 2009

Abstract

The volume of biomedical literature has experienced explosive growth in recent years. This is reflected in the corresponding increase in the size of MEDLINE®, the largest bibliographic database of biomedical citations. Indexers at the US National Library of Medicine (NLM) need efficient tools to help them accommodate the ensuing workload. After reviewing issues in the automatic assignment of Medical Subject Headings (MeSH® terms) to biomedical text, we focus more specifically on the new subheading attachment feature for NLM's Medical Text Indexer (MTI). Natural Language Processing, statistical, and machine learning methods of producing automatic MeSH main heading/subheading pair recommendations were assessed independently and combined. The best combination achieves 48% precision and 30% recall. After validation by NLM indexers, a suitable combination of the methods presented in this paper was integrated into MTI as a subheading attachment feature producing MeSH indexing recommendations compliant with current state-of-the-art indexing practice.
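
To make the reported figures concrete, the following is a minimal sketch of how precision and recall could be computed for main heading/subheading pair recommendations against an indexer's reference set. It is not the evaluation code used by the authors: the function names and the example pairs are hypothetical, and the abstract does not state how the scores were averaged across citations. The F-measure at the end is plain arithmetic on the reported 48% and 30%, not a number given in the abstract.

```python
# Illustrative sketch only: names and example data are assumptions,
# not taken from the paper or from MTI.

def precision_recall(recommended, reference):
    """Return (precision, recall) for two collections of
    (main heading, subheading) pairs."""
    recommended, reference = set(recommended), set(reference)
    correct = recommended & reference  # pairs both recommended and in the reference
    precision = len(correct) / len(recommended) if recommended else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


def f_measure(precision, recall):
    """Balanced F-measure (harmonic mean of precision and recall)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical recommendations for one citation.
recommended = {
    ("Neoplasms", "drug therapy"),
    ("Neoplasms", "genetics"),
    ("Liver", "pathology"),
}
# Hypothetical pairs chosen by a human indexer for the same citation.
reference = {
    ("Neoplasms", "drug therapy"),
    ("Liver", "pathology"),
    ("Liver", "metabolism"),
    ("Antineoplastic Agents", "therapeutic use"),
}

p, r = precision_recall(recommended, reference)
print(f"precision={p:.2f} recall={r:.2f}")

# Harmonic mean of the precision and recall reported in the abstract.
print(f"F-measure at 48% / 30%: {f_measure(0.48, 0.30):.3f}")
```

Under this definition a pair counts as correct only when both the main heading and the subheading match, which is one reason pair-level scores tend to be lower than scores for main headings alone.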