A study of terminology auditors' performance for UMLS semantic type assignments

  • Authors and affiliations:
  • Huanying (Helen) Gu (New York Institute of Technology, New York, NY, United States)
  • Gai Elhanan (New Jersey Institute of Technology, Newark, NJ, United States)
  • Yehoshua Perl (New Jersey Institute of Technology, Newark, NJ, United States)
  • George Hripcsak (Columbia University, New York, NY, United States)
  • James J. Cimino (NIH Clinical Center, Bethesda, MD, United States)
  • Julia Xu (NIH Clinical Center, Bethesda, MD, United States)
  • Yan Chen (BMCC, City University of New York, New York, NY, United States)
  • James Geller (New Jersey Institute of Technology, Newark, NJ, United States)
  • C. Paul Morrey (Utah Valley University, Orem, UT, United States)

  • Venue:
  • Journal of Biomedical Informatics
  • Year:
  • 2012

Abstract

Auditing healthcare terminologies for errors requires human experts. In this paper, we present a study of the performance of auditors looking for errors in the semantic type assignments of complex UMLS concepts. In this study, concepts are considered complex whenever they are assigned combinations of semantic types. Past research has shown that complex concepts have a higher likelihood of errors. The results of this study indicate that individual auditors are not reliable when auditing such concepts and that their performance, according to various metrics, is low. These results confirm the outcomes of an earlier pilot study. They imply that, to achieve an acceptable level of reliability and performance when auditing such UMLS concepts, several auditors need to be assigned the same task. A mechanism is then needed to combine the possibly differing opinions of the different auditors into a final determination. In the current study, in contrast to our previous work, we used a majority mechanism for this purpose. For a sample of 232 complex UMLS concepts, the majority opinion was found reliable, and its accuracy, recall, precision, and F-measure were statistically significantly higher than the average performance of individual auditors.
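
The abstract refers to a majority mechanism for combining auditors' judgments and to accuracy, recall, precision, and the F-measure as performance metrics. The sketch below is not taken from the paper; it is a minimal illustration, assuming binary per-concept judgments (erroneous vs. correct semantic type assignment) from an odd number of auditors and a hypothetical reference standard, of how such a majority opinion could be formed and scored.

```python
from collections import Counter

def majority_label(labels):
    """Return the judgment chosen by the majority of auditors.

    `labels` is a list of binary judgments (True = "erroneous semantic
    type assignment", False = "correct") from an odd number of auditors,
    so a strict majority always exists.
    """
    return Counter(labels).most_common(1)[0][0]

def performance(predicted, reference):
    """Compute accuracy, precision, recall, and F-measure of the
    predicted judgments against a reference standard."""
    tp = sum(p and r for p, r in zip(predicted, reference))
    fp = sum(p and not r for p, r in zip(predicted, reference))
    fn = sum(not p and r for p, r in zip(predicted, reference))
    tn = sum(not p and not r for p, r in zip(predicted, reference))
    accuracy = (tp + tn) / len(reference)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return accuracy, precision, recall, f_measure

# Hypothetical example: three auditors judge four concepts; the
# reference standard records which assignments are actually erroneous.
auditor_votes = [
    [True, True, False],    # concept 1
    [False, False, False],  # concept 2
    [True, False, True],    # concept 3
    [False, True, False],   # concept 4
]
reference = [True, False, True, True]

combined = [majority_label(votes) for votes in auditor_votes]
print(performance(combined, reference))
```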