Section classification in clinical notes using supervised hidden Markov model

  • Authors:
  • Ying Li;Sharon Lipsky Gorman;Noémie Elhadad

  • Affiliations:
  • Columbia University, New York, NY, USA;Columbia University, New York, NY, USA;Columbia University, New York, NY, USA

  • Venue:
  • Proceedings of the 1st ACM International Health Informatics Symposium
  • Year:
  • 2010

Abstract

As more and more information becomes available in the Electronic Health Record in the form of free-text narrative, there is a need for automated tools that can process and understand such texts. A first step towards the automated processing of clinical texts is to determine the document-level structure of a patient note, i.e., to identify the different sections and map them to known section types automatically. This paper treats section mapping as a sequence-labeling problem over 15 known section types. Our method relies on a Hidden Markov Model (HMM) trained on a corpus of 9,679 clinical notes from NewYork-Presbyterian Hospital. We compare our method to a state-of-the-art baseline, which ignores the sequential aspect of the sections and classifies each section independently of the others in a note. Experiments show that our method significantly outperforms the baseline, yielding 93% accuracy in identifying sections individually and 70% accuracy in identifying all the sections in a note, compared to 70% and 19%, respectively, for the baseline method.
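To make the contrast in the abstract concrete, the following is a minimal sketch of labeling the sections of one note with Viterbi decoding over an HMM, next to a classifier that labels each section on its own. It is not the authors' implementation. The three section labels, the unigram emission model, the unseen-word floor, and all probabilities are hypothetical values chosen for illustration; in a supervised HMM these parameters would be estimated from counts over labeled notes. The two accuracy functions mirror the two measures reported in the abstract (per-section and whole-note).

```python
import math

# Hypothetical section labels; the paper uses 15 types not listed in the abstract.
STATES = ["CHIEF_COMPLAINT", "HISTORY", "PLAN"]

# Toy parameters (assumptions). A supervised HMM would estimate these
# from labeled notes by counting.
START = {"CHIEF_COMPLAINT": 0.8, "HISTORY": 0.1, "PLAN": 0.1}
TRANS = {
    "CHIEF_COMPLAINT": {"CHIEF_COMPLAINT": 0.1, "HISTORY": 0.8, "PLAN": 0.1},
    "HISTORY": {"CHIEF_COMPLAINT": 0.1, "HISTORY": 0.2, "PLAN": 0.7},
    "PLAN": {"CHIEF_COMPLAINT": 0.1, "HISTORY": 0.1, "PLAN": 0.8},
}
# Unigram word probabilities per section type (assumed emission model).
EMIT = {
    "CHIEF_COMPLAINT": {"pain": 0.6, "history": 0.1, "follow": 0.1, "patient": 0.2},
    "HISTORY": {"pain": 0.1, "history": 0.5, "follow": 0.1, "patient": 0.3},
    "PLAN": {"pain": 0.1, "history": 0.1, "follow": 0.4, "patient": 0.4},
}
UNSEEN = 1e-6  # floor probability for out-of-vocabulary words (assumption)


def emit_logp(state, tokens):
    """Log-probability of a section's tokens under one state's unigram model."""
    return sum(math.log(EMIT[state].get(tok, UNSEEN)) for tok in tokens)


def classify_independently(note):
    """Baseline-style labeling: each section is scored without its neighbors."""
    return [max(STATES, key=lambda s: emit_logp(s, sec)) for sec in note]


def viterbi(note):
    """Most likely label sequence for a note (a list of token lists)."""
    if not note:
        return []
    # best[t][s]: best log-probability of any path ending in state s at section t
    best = [{s: math.log(START[s]) + emit_logp(s, note[0]) for s in STATES}]
    back = []  # back[t-1][s]: best predecessor of state s at section t
    for sec in note[1:]:
        scores, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: best[-1][p] + math.log(TRANS[p][s]))
            scores[s] = best[-1][prev] + math.log(TRANS[prev][s]) + emit_logp(s, sec)
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Follow back-pointers from the best final state.
    path = [max(STATES, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def section_accuracy(gold_notes, pred_notes):
    """Fraction of individual sections labeled correctly."""
    pairs = [(g, p) for gn, pn in zip(gold_notes, pred_notes) for g, p in zip(gn, pn)]
    return sum(g == p for g, p in pairs) / len(pairs)


def note_accuracy(gold_notes, pred_notes):
    """Fraction of notes in which every section is labeled correctly."""
    return sum(g == p for g, p in zip(gold_notes, pred_notes)) / len(gold_notes)


note = [["pain"], ["patient"], ["follow"]]
gold = ["CHIEF_COMPLAINT", "HISTORY", "PLAN"]
hmm_labels = viterbi(note)
baseline_labels = classify_independently(note)
```

On this toy note the ambiguous middle section ("patient") is labeled PLAN when scored alone, but the transition probabilities lead Viterbi to label it HISTORY, so only the sequence model gets the whole note right. The example illustrates why whole-note accuracy can differ sharply between the two approaches; it says nothing about the magnitudes reported in the paper.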