Developing feature types for classifying clinical notes

  • Authors: Jon Patrick; Yitao Zhang; Yefeng Wang
  • Affiliations: University of Sydney, Australia (all three authors)
  • Venue: BioNLP '07 Proceedings of the Workshop on BioNLP 2007: Biological, Translational, and Clinical Language Processing
  • Year: 2007

Abstract

This paper proposes a machine learning approach to the task of assigning codes from ICD-9-CM, the international standard for the classification of diseases, to clinical records. Treating the task as a text categorisation problem, the authors built a classification system that explores a variety of features: negation, different strategies for measuring gloss overlap between the content of clinical records and ICD-9-CM code descriptions, and expansion of the glosses using the ICD-9-CM hierarchy. The best classifier achieved an overall F1 value of 88.2 on a data set of 978 free-text clinical records, outperforming two of the three human annotators.
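
The abstract does not say how gloss overlap is computed, so the following is only a minimal sketch of one plausible reading: overlap is taken as the fraction of distinct gloss tokens that also occur in the record, and a gloss is expanded by concatenating it with the glosses of its ancestors in the code hierarchy. The function names, the tokenisation, the overlap measure, and the toy glosses and record are all assumptions made for illustration, not the authors' method or data.

```python
import re


def tokenize(text):
    """Lowercase the text and return its alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def gloss_overlap(record, gloss):
    """Fraction of distinct gloss tokens that also occur in the record.

    This is one assumed overlap measure; the paper compares several
    strategies that the abstract does not spell out.
    """
    gloss_tokens = set(tokenize(gloss))
    if not gloss_tokens:
        return 0.0
    record_tokens = set(tokenize(record))
    return len(gloss_tokens & record_tokens) / len(gloss_tokens)


def expanded_gloss(code, glosses, parent):
    """Join a code's gloss with the glosses of all its ancestors."""
    parts = []
    while code is not None:
        parts.append(glosses[code])
        code = parent.get(code)  # None once the top of the hierarchy is reached
    return " ".join(parts)


# Toy, paraphrased glosses and a made-up record (illustrative only).
GLOSSES = {
    "786": "symptoms involving respiratory system",
    "786.2": "cough",
}
PARENT = {"786.2": "786"}

record = "Patient presents with persistent cough, no fever."

plain = gloss_overlap(record, GLOSSES["786.2"])
expanded = gloss_overlap(record, expanded_gloss("786.2", GLOSSES, PARENT))
```

In this toy case the plain gloss is fully covered by the record, while the expanded gloss is only partly covered because the parent's wording does not appear in the note. Scores like these would be passed to a classifier as features alongside others such as negation.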