A vocabulary development and visualization tool based on natural language processing and the mining of textual patient reports

  • Authors:
  • Carol Friedman;Hongfang Liu;Lyudmila Shagina

  • Affiliations:
  • Department of Medical Informatics, Columbia University, 622 West 168 Street, VC-5 Bldg, New York, NY (all authors)

  • Venue:
  • Journal of Biomedical Informatics
  • Year:
  • 2003

Abstract

Medical terminologies are critical for automated healthcare systems. Some terminologies, such as the UMLS and SNOMED, are comprehensive, whereas others specialize in limited domains (e.g., BIRADS) or are developed for specific applications. Important features of a terminology are comprehensive coverage of relevant clinical terms and ease of use by its users, which include computerized applications. We have developed a method for facilitating vocabulary development and maintenance that uses natural language processing to mine large collections of clinical reports for information on terminology as expressed by physicians. Once the reports are processed and the terms are structured and collected into an XML representational schema, it is possible to determine information about terms, such as frequency of occurrence, compositionality, relations to other terms (such as modifiers), and correspondence to a controlled vocabulary. This paper describes the method and discusses how it can be used as a tool to help vocabulary builders navigate through the terms physicians use, visualize their relations to other terms via a flexible viewer, and determine their correspondence to a controlled vocabulary.
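To make the kind of term statistics mentioned in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a hypothetical XML layout in which each `term` element carries a `value` attribute, an optional `code` attribute linking it to a controlled vocabulary, and nested `modifier` elements; the element names, attribute names, sample data, and the function `summarize_terms` are all illustrative assumptions, since the abstract does not specify the schema.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Hypothetical structured NLP output for two reports. The element names,
# attribute names, and the placeholder code "CODE-0001" are assumptions
# made for illustration; they are not the schema used in the paper.
SAMPLE_XML = """
<reports>
  <report id="r1">
    <term type="problem" value="mass" code="CODE-0001">
      <modifier type="bodyloc" value="breast"/>
      <modifier type="size" value="small"/>
    </term>
    <term type="problem" value="calcification">
      <modifier type="bodyloc" value="breast"/>
    </term>
  </report>
  <report id="r2">
    <term type="problem" value="mass" code="CODE-0001">
      <modifier type="bodyloc" value="lung"/>
    </term>
  </report>
</reports>
"""


def summarize_terms(xml_text):
    """Collect simple per-term statistics from the assumed XML layout.

    Returns a 4-tuple:
      frequency  - Counter mapping term value to number of occurrences
      modifiers  - dict mapping term value to a Counter of (type, value) modifiers
      codes      - dict mapping term value to its controlled-vocabulary code
      unmapped   - sorted list of term values that never carried a code
    """
    root = ET.fromstring(xml_text)
    frequency = Counter()
    modifiers = defaultdict(Counter)
    codes = {}

    for term in root.iter("term"):
        name = term.get("value")
        frequency[name] += 1

        # Record which modifiers co-occur with this term.
        for mod in term.findall("modifier"):
            modifiers[name][(mod.get("type"), mod.get("value"))] += 1

        # A term with a code attribute is treated as mapped to the vocabulary.
        code = term.get("code")
        if code is not None:
            codes[name] = code

    unmapped = sorted(set(frequency) - set(codes))
    return frequency, modifiers, codes, unmapped


if __name__ == "__main__":
    frequency, modifiers, codes, unmapped = summarize_terms(SAMPLE_XML)
    for name, count in frequency.most_common():
        print(name, count, dict(modifiers[name]), codes.get(name))
    print("terms without a vocabulary code:", unmapped)
```

In this sketch, terms that never appear with a code are reported separately, which loosely mirrors the idea of checking physician terms against a controlled vocabulary; a real system would need a proper mapping step rather than a precomputed attribute.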