Automatic extraction of semantic content from medical discharge records

  • Authors:
  • György Szarvas;Szilárd Iván;András Bánhalmi;János Csirik

  • Affiliations:
  • University of Szeged, Department of Informatics, Szeged, Hungary;University of Szeged, Department of Informatics, Szeged, Hungary;University of Szeged, Department of Informatics, Szeged, Hungary;Hungarian Academy of Sciences, Research Group on Artificial Intelligence, Szeged, Hungary

  • Venue:
  • ICOSSE'06 Proceedings of the 5th WSEAS international conference on System science and simulation in engineering
  • Year:
  • 2006

Abstract

Semi-structured medical texts such as discharge summaries are rich sources of information that can support the research of physicians by enabling statistical analysis of similar cases. In this paper we introduce a system based on machine learning algorithms that successfully classifies discharge records according to the smoking status of the patient. We distinguish five classes: current smoker, past smoker, smoker (where a decision between the former two classes cannot be made), non-smoker, and unknown (where the document contains no data on smoking status). Such systems are useful for examining the connection between certain social habits and diseases such as cancer or asthma. We trained and tested our model on the data of the shared task organized by the I2B2 (Informatics for Integrating Biology and the Bedside) research center [1], and despite the small amount of training data available, our system shows promising results in identifying the smoking habits of patients from their medical discharge summaries.