Integration Of Structured And Unstructured Text Data In A Clinical Information System

  • Authors:
  • Ching-Song D. Wei;Sam Y. Sung;Simon J. Doong;Peter A. Ng

  • Affiliations:
  • Department of Computer Information Systems, BMCC, City University of New York, New York, USA;Department of Computer Science, South Texas College of Science and Technology, McAllen, TX, USA;Department of Information Management, China University of Technology, Taipei, Taiwan;Department of Computer Science, University of Texas-Pan American, Edinburg, TX, USA

  • Venue:
  • Journal of Integrated Design & Process Science
  • Year:
  • 2006

Abstract

This paper describes the integration of structured and unstructured text data in a Clinical Information System (CIS). The system's input and output depend heavily on a medical database and an intensive Graphical User Interface (GUI). The data source has two parts: the fields of the various GUI forms, which make up the structured data, and the free-text medical reports of patients, which make up the more complicated, unstructured part. These free-text reports typically cover patient intake, examination, and discharge, and may be entered through a keyboard, a handwriting device, a microphone, a transcription service, or other means. Medical language extraction and encoding systems, such as MedLEE, can translate medical free text into a standard, structured format. Such systems are reviewed and, where applicable, adopted to extract knowledge from medical free text. To support data sharing and future interoperability between heterogeneous database systems, the integrated data are stored in a clinical data repository built on a standard-compliant eXtensible Markup Language (XML) representation together with the GUI forms. A number of healthcare information standards, such as HL7 (Health Level 7, 2006) and Medical Markup Language (Medical Markup Language, 2006), currently exist to promote data exchange. However, exchange succeeds only when the sender and the receiver of the data use the same standard. This paper therefore develops a generic XML data model and demonstrates the feasibility of transforming that model into any of these standards.
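
The idea of holding structured form fields and free-text reports in one generic XML record can be illustrated with a minimal Python sketch. It is not the paper's actual model: the element names (PatientRecord, StructuredData, Field, UnstructuredData, Report), the function name build_patient_record, and the sample values are all hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def build_patient_record(structured_fields, free_text_reports):
    """Combine GUI form fields and free-text reports into one XML element.

    All element and attribute names are hypothetical; the abstract does
    not specify the schema of the generic XML model.
    """
    record = ET.Element("PatientRecord")

    # Structured part: one <Field> element per GUI form field.
    structured = ET.SubElement(record, "StructuredData")
    for name, value in structured_fields.items():
        field = ET.SubElement(structured, "Field", name=name)
        field.text = str(value)

    # Unstructured part: one <Report> element per free-text report.
    unstructured = ET.SubElement(record, "UnstructuredData")
    for report_type, text in free_text_reports.items():
        report = ET.SubElement(unstructured, "Report", type=report_type)
        report.text = text

    return record


# Made-up sample data, for illustration only.
record = build_patient_record(
    {"patient_id": "P001", "age": 54},
    {
        "intake": "Patient reports chest pain for two days.",
        "discharge": "Discharged in stable condition.",
    },
)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

In a design along these lines, conversion to a specific exchange standard would be a separate mapping step applied to the generic record, so the stored data need not be tied to any one standard.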