Using Existing Biomedical Resources to Detect and Ground Terms in Biomedical Literature

  • Authors:
  • Kaarel Kaljurand; Fabio Rinaldi; Thomas Kappeler; Gerold Schneider

  • Affiliations:
  • Institute of Computational Linguistics, University of Zurich (all authors)

  • Venue:
  • AIME '09 Proceedings of the 12th Conference on Artificial Intelligence in Medicine: Artificial Intelligence in Medicine
  • Year:
  • 2009

Abstract

We present an approach to the automatic detection of names of biomedical entities such as proteins, genes, and species in biomedical literature, and to their grounding to widely accepted identifiers. The annotation relies on three components: a large term list that contains the common expressions of the terms, a normalization step that matches the listed terms against the forms in which they actually appear in the texts, and a disambiguation step that resolves the ambiguity of matched terms. We describe various characteristics of the terms found in existing term resources and of the terms that are used in biomedical texts. We evaluate our results against a corpus of manually annotated protein mentions and achieve a precision of 57% and a recall of 72%.
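
The three components named in the abstract (term list, normalization, disambiguation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the term list, the identifiers (`PROT:0001` and so on), the species table, the normalization rule (lowercasing and dropping non-alphanumeric characters), and the species-based disambiguation heuristic are all assumptions chosen only to make the idea concrete.

```python
import re
from collections import defaultdict

# Hypothetical toy term list: surface form -> candidate identifiers.
# The identifiers are placeholders, not entries from a real database.
TERM_LIST = {
    "IL-2": ["PROT:0001"],
    "interleukin 2": ["PROT:0001"],
    "p53": ["PROT:0002", "PROT:0003"],  # ambiguous: two candidates
}

# Hypothetical identifier -> species table, used only for disambiguation.
ID_SPECIES = {
    "PROT:0001": "human",
    "PROT:0002": "human",
    "PROT:0003": "mouse",
}


def normalize(text):
    """Assumed normalization: lowercase, drop everything but letters/digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def build_index(term_list):
    """Map each normalized surface form to the set of identifiers it may denote."""
    index = defaultdict(set)
    for surface, ids in term_list.items():
        index[normalize(surface)].update(ids)
    return index


def detect(tokens, index, max_len=3):
    """Greedy longest match over token n-grams.

    Returns a list of (start, end, candidate_ids) with end exclusive.
    """
    matches = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = normalize(" ".join(tokens[i:i + n]))
            if key and key in index:
                matches.append((i, i + n, sorted(index[key])))
                i += n
                break
        else:
            # No term starts at this token; move on.
            i += 1
    return matches


def disambiguate(ids, context_species):
    """Assumed heuristic: keep candidates whose species occurs in the context.

    Falls back to all candidates when the filter would remove everything.
    """
    kept = [i for i in ids if ID_SPECIES.get(i) in context_species]
    return kept or ids


tokens = "Human IL2 and Interleukin-2 interact with p53".split()
index = build_index(TERM_LIST)
matches = detect(tokens, index)
grounded = [(s, e, disambiguate(ids, {"human"})) for s, e, ids in matches]
```

Here the normalization step lets the list entries "IL-2" and "interleukin 2" match the textual variants "IL2" and "Interleukin-2", and the ambiguous "p53" is reduced to a single candidate because only one of its hypothetical identifiers belongs to a species mentioned in the context.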