An expert study evaluating the UMLS lexical metaschema

  • Authors:
  • Li Zhang;George Hripcsak;Yehoshua Perl;Michael Halper;James Geller

  • Affiliations:
  • Computer Science Department, New Jersey Institute of Technology, University Heights, Newark, NJ 07102, USA;Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA;Computer Science Department, New Jersey Institute of Technology, University Heights, Newark, NJ 07102, USA;Mathematics and Computer Science Department, Kean University, Union, NJ 07083, USA;Computer Science Department, New Jersey Institute of Technology, University Heights, Newark, NJ 07102, USA

  • Venue:
  • Artificial Intelligence in Medicine
  • Year:
  • 2005

Abstract

Objective: A metaschema is an abstraction network of the UMLS's semantic network (SN) obtained from a connected partition of its collection of semantic types. A lexical metaschema was previously derived from a lexical partition, which divides the SN into semantic-type groups based on identical word usage among the names of semantic types and the definitions of their respective children. In this paper, a statistical analysis methodology is presented to evaluate the lexical metaschema based on a study involving a group of established UMLS experts.
Methods: In the study, each expert was asked to identify subject areas of the SN based on his or her understanding of the various semantic types. For this purpose, the expert scanned the SN hierarchy top-down, identifying semantic types that are important and sufficiently different from their parent semantic types as roots of their groups. From the response of each expert, an "expert metaschema" was constructed. Because the different experts' metaschemas can vary widely, additional metaschemas were obtained from aggregations of the experts' responses. Of special interest is the consensus metaschema, which represents an aggregation of a simple majority of the experts' responses. A statistical analysis comparing the lexical metaschema with the experts' metaschemas and the consensus metaschema is presented.
Results: The analysis shows that 17 of the 21 meta-semantic types in the lexical metaschema (about 81%) also appear in the consensus metaschema. There are 107 semantic types (about 79%) covered by identical meta-semantic types and refinements. These results indicate a high similarity between the two metaschemas. Furthermore, the statistical analysis shows that the lexical metaschema did not grossly underperform compared to the experts.
Conclusion: Our study shows that the lexical metaschema provides a good approximation for a partition of meaningful subject areas in the SN, when compared to the consensus metaschema capturing the aggregation of a simple majority of the human experts' opinions.
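
The aggregation step described in the Methods can be illustrated with a small sketch. It is not the authors' implementation: the toy hierarchy, the expert responses, and the function names `consensus_roots`, `partition`, and `shared_fraction` are all hypothetical, and the sketch assumes that the top of the hierarchy always starts a group. It shows one plausible reading of "simple majority" (a semantic type becomes a group root when more than half of the experts name it) and how a set of roots induces a connected partition of a tree.

```python
from collections import Counter

# Hypothetical toy hierarchy (child -> parent); NOT the real UMLS semantic network.
PARENT = {
    "Entity": None,
    "Physical Object": "Entity",
    "Organism": "Physical Object",
    "Plant": "Organism",
    "Animal": "Organism",
    "Conceptual Entity": "Entity",
    "Finding": "Conceptual Entity",
}

def consensus_roots(expert_roots, top):
    """Types named as group roots by more than half of the experts."""
    votes = Counter(t for roots in expert_roots for t in set(roots))
    chosen = {t for t, n in votes.items() if n > len(expert_roots) / 2}
    chosen.add(top)  # assumption: the top of the hierarchy always starts a group
    return chosen

def partition(parent, roots):
    """Assign each type to its nearest ancestor-or-self that is a root.

    Every group is then a connected subtree, so the result is a connected partition.
    """
    groups = {}
    for t in parent:
        node = t
        while node not in roots:
            node = parent[node]
        groups.setdefault(node, set()).add(t)
    return groups

def shared_fraction(a, b):
    """Fraction of the roots in `a` that also appear in `b`."""
    return len(a & b) / len(a)

# Hypothetical responses from three experts.
experts = [
    {"Entity", "Organism", "Conceptual Entity"},
    {"Entity", "Organism"},
    {"Entity", "Physical Object", "Conceptual Entity"},
]

roots = consensus_roots(experts, top="Entity")
groups = partition(PARENT, roots)
```

With these toy inputs, "Physical Object" receives only one of three votes and is absorbed into the "Entity" group, while "Organism" and "Conceptual Entity" each root their own group. The same `shared_fraction` arithmetic applied to the figures reported in the Results gives 17/21, or about 81%.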