Novelty detection in patient histories: experiments with measures based on text compression

  • Authors: Ole Edsberg; Øystein Nytrø; Thomas Brox Røst

  • Affiliation (all three authors): Norwegian University of Science and Technology, Department of Computer and Information Science, Trondheim, Norway

  • Venue: IDA'07 Proceedings of the 7th international conference on Intelligent data analysis
  • Year: 2007

Abstract

Reviewing a patient history can be very time-consuming, partly because of the large number of consultation notes, most of which often contain little new information. The ability to automatically detect the novel notes would make it possible to build tools that support this and other tasks. We propose measures based on text compression, used as an approximation of Kolmogorov complexity, for classifying note novelty. We define four compression-based measures and eight other measures, and evaluate their ability to predict whether a note in a general-practice patient history is associated with previously unseen diagnosis codes. The best measures show promising classification ability which, while not sufficient to serve alone as a clinical tool, might be useful as part of a system that takes more data types into account. The best individual measure was the normalized asymmetric compression distance between the concatenated prior notes and the current note.
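
To make the idea of a compression-based novelty measure concrete, the following is a minimal sketch in Python. It is not the authors' implementation: the abstract does not give the formula, so the definition used here, (C(prior + note) - C(prior)) / C(note) with C the compressed size in bytes, is an assumption, as are the choice of zlib as compressor and the hypothetical names `compressed_size` and `asymmetric_novelty`. The intuition is that a note adding little to what the prior notes already contain should cost few extra bytes to compress once the prior notes are known.

```python
import zlib


def compressed_size(text: str) -> int:
    """Size in bytes of the zlib-compressed UTF-8 encoding of text."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def asymmetric_novelty(prior_notes: list[str], current_note: str) -> float:
    """Assumed normalized asymmetric compression distance.

    Computes (C(prior + note) - C(prior)) / C(note), where C is the
    compressed size. This formula is an illustrative assumption, not
    taken from the paper. Lower values suggest the current note is
    largely predictable from the prior notes.
    """
    prior = "\n".join(prior_notes)
    c_prior = compressed_size(prior)
    c_joint = compressed_size(prior + "\n" + current_note)
    c_note = compressed_size(current_note)
    # c_note is never zero: zlib output always includes header and checksum.
    return (c_joint - c_prior) / c_note


# Hypothetical notes, invented for illustration only.
history = ["Patient reports cough and mild fever. Prescribed rest and fluids."]
repeated = "Patient reports cough and mild fever. Prescribed rest and fluids."
novel = "Fracture of left wrist after fall from bicycle; referred to radiology."

score_repeated = asymmetric_novelty(history, repeated)
score_novel = asymmetric_novelty(history, novel)
print(score_repeated, score_novel)
```

In this sketch the repeated note should score lower than the novel one, since the compressor can encode it as a back-reference to the prior text. A general-purpose compressor such as zlib only sees a limited window of earlier text (32 KB for DEFLATE), so a long history might call for a different compressor; that choice is left open here.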