Fast and simple semantic class assignment for biomedical text

  • Authors:
  • K. Bretonnel Cohen;Tom Christiansen;William A. Baumgartner, Jr.;Karin Verspoor;Lawrence E. Hunter

  • Affiliations:
  • Computational Bioscience Program, U. Colorado School of Medicine, U. of Colorado at Boulder;Computational Bioscience Program, U. Colorado School of Medicine;Computational Bioscience Program, U. Colorado School of Medicine;Computational Bioscience Program, U. Colorado School of Medicine;Computational Bioscience Program, U. Colorado School of Medicine

  • Venue:
  • BioNLP '11 Proceedings of BioNLP 2011 Workshop
  • Year:
  • 2011

Abstract

A simple and accurate method for assigning broad semantic classes to text strings is presented. The method maps text strings to terms in ontologies through a pipeline of exact matching, string normalization, headword matching, and headword stemming. The results of three experiments evaluating the technique are given. Five semantic classes are evaluated against the CRAFT corpus of full-text journal articles. Twenty semantic classes are evaluated against the corresponding full ontologies, i.e., by reflexive matching. One semantic class is evaluated against a structured test suite. When evaluating against only the ontologies in the corpus, precision, recall, and F-measure on the corpus are 67.06/78.49/72.32 micro-averaged and 69.84/83.12/75.31 macro-averaged. Accuracy on the corpus when evaluating against all twenty semantic classes ranges from 77.12% to 95.73%. Reflexive matching is generally successful, but reveals a small number of errors in the implementation. Evaluation with the structured test suite reveals a number of characteristics of the performance of the approach.
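
The four-stage pipeline named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy term lists, the function names, the rightmost-token headword rule, and the crude suffix stripper are all assumptions made for illustration, since the abstract does not say how headwords are identified or which stemmer is used. Each stage is tried in order, and the first stage that finds a match determines the semantic class.

```python
import re

# Hypothetical toy term lists; a real system would load full ontologies.
ONTOLOGY_TERMS = {
    "cell_type": ["T cell", "neuron", "erythrocyte"],
    "biological_process": ["apoptosis", "cell migration", "protein binding"],
}

def normalize(s):
    """Lowercase, turn punctuation into spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", s.lower()).split())

def headword(s):
    """Assumption: the rightmost token of the normalized string is the head."""
    tokens = normalize(s).split()
    return tokens[-1] if tokens else ""

def crude_stem(word):
    """Strip a few English suffixes; a stand-in for a real stemmer."""
    for suffix in ("ies", "es", "s", "ing"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

# Stages in order of decreasing strictness.
STAGES = [
    ("exact", lambda t: t),
    ("normalized", normalize),
    ("headword", headword),
    ("stemmed_headword", lambda t: crude_stem(headword(t))),
]

def build_indexes(terms_by_class):
    """For each stage, map the stage's key for every term to its class."""
    indexes = []
    for name, key_fn in STAGES:
        index = {}
        for cls, terms in terms_by_class.items():
            for term in terms:
                # On key collisions, keep the first class seen.
                index.setdefault(key_fn(term), cls)
        indexes.append((name, key_fn, index))
    return indexes

def assign_class(text, indexes):
    """Return (semantic_class, stage_name) for the first matching stage."""
    for name, key_fn, index in indexes:
        key = key_fn(text)
        if key and key in index:
            return index[key], name
    return None, None

indexes = build_indexes(ONTOLOGY_TERMS)
for s in ["neuron", "T-cell", "cortical neuron", "cortical neurons", "kinase"]:
    print(s, "->", assign_class(s, indexes))
```

In this sketch the later stages are progressively looser, so placing them after the stricter ones means a loose match is used only when no stricter match exists. A string that matches at no stage is left unclassified.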