Auditing complex concepts of SNOMED using a refined hierarchical abstraction network

  • Authors:
  • Yue Wang; Michael Halper; Duo Wei; Huanying Gu; Yehoshua Perl; Junchuan Xu; Gai Elhanan; Yan Chen; Kent A. Spackman; James T. Case; George Hripcsak

  • Affiliations:
  • Computer Science Dept., New Jersey Institute of Technology, Newark, NJ 07102, USA; Information Technology Dept., New Jersey Institute of Technology, Newark, NJ 07102, USA; Computer Science and Information Systems, School of Business, The Richard Stockton College of New Jersey, Galloway, NJ 08205, USA; Computer Science Dept., New York Institute of Technology, New York, NY 10023, USA; Computer Science Dept., New Jersey Institute of Technology, Newark, NJ 07102, USA; Computer Science Dept., New Jersey Institute of Technology, Newark, NJ 07102, USA; Computer Science Dept., New Jersey Institute of Technology, Newark, NJ 07102, USA and Halfpenny Technologies, Inc., Blue Bell, PA 19422, USA; Computer Information Systems Dept., BMCC, CUNY, New York, NY 10007, USA; IHTSDO, 2300 Copenhagen S, Denmark; NLM/NIH, Bethesda, MD 20817, USA; Dept. of Biomedical Informatics, Columbia University, New York, NY 10032, USA

  • Venue:
  • Journal of Biomedical Informatics
  • Year:
  • 2012

Abstract

Auditors of a large terminology, such as SNOMED CT, face a daunting challenge. To aid them in their efforts, it is essential to devise techniques that can automatically identify concepts warranting special attention. "Complex" concepts, which by their very nature are more difficult to model, fall neatly into this category. A special kind of grouping, called a partial-area, is utilized in the characterization of complex concepts. In particular, the complex concepts that are the focus of this work are those appearing in intersections of multiple partial-areas and are thus referred to as overlapping concepts. A companion paper presented an automatic methodology for identifying the entire collection of overlapping concepts and partitioning it into disjoint, singly rooted groups that are more manageable to work with and comprehend. The partitioning methodology formed the foundation for the development of an abstraction network for the overlapping concepts called a disjoint partial-area taxonomy. This new disjoint partial-area taxonomy offers a collection of semantically uniform partial-areas and is exploited herein as the basis for a novel auditing methodology. The overlapping concepts are reviewed in a top-down order within semantically uniform groups. These groups are themselves reviewed in a top-down order, which proceeds from the less complex to the more complex overlapping concepts. The results of applying the methodology to SNOMED's Specimen hierarchy are presented. Hypotheses regarding error ratios for overlapping concepts and between different kinds of overlapping concepts are formulated. Two phases of auditing the Specimen hierarchy, covering two releases of SNOMED, are reported. With the use of the double bootstrap and Fisher's exact test (two-tailed), the auditing of concepts and especially roots of overlapping partial-areas is shown to yield a statistically significantly higher proportion of errors.
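
To make the statistical comparison mentioned in the abstract concrete, the following is a minimal sketch of a two-tailed Fisher's exact test on a 2x2 table of erroneous versus error-free concepts in two groups. It is not the authors' code; the function name `fisher_exact_two_tailed` and the counts in the usage lines are hypothetical and are not taken from the paper.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Hypothetical layout: row 1 is one group of concepts (for example,
    overlapping), row 2 is the other group; column 1 counts concepts
    found erroneous, column 2 counts concepts found error-free.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def weight(x):
        # Unnormalized hypergeometric probability of x errors in row 1,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    # Integer weights make the comparison exact, so no tolerance is needed.
    tail = sum(
        weight(x) for x in range(lowest, highest + 1) if weight(x) <= observed
    )
    return tail / comb(total, col1)


# Made-up counts for illustration only: 30 of 100 concepts erroneous in
# one group versus 10 of 100 in the other.
p_value = fisher_exact_two_tailed(30, 70, 10, 90)
print(f"two-tailed p = {p_value:.6f}")
```

A small p-value here would indicate that the difference in error proportions between the two groups is unlikely under the null hypothesis of equal error rates. The double bootstrap also named in the abstract is not covered by this sketch.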