Developing feature types for classifying clinical notes

Authors:
Jon Patrick;Yitao Zhang;Yefeng Wang
Affiliations:
University of Sydney, Australia;University of Sydney, Australia;University of Sydney, Australia
Venue:
BioNLP '07 Proceedings of the Workshop on BioNLP 2007: Biological, Translational, and Clinical Language Processing
Year:
2007

Abstract

This paper proposes a machine learning approach to the task of assigning ICD-9-CM codes, the international standard for the classification of diseases, to clinical records. Treating the task as a text categorisation problem, the authors built a classification system that explores a variety of features, including negation, different strategies for measuring gloss overlap between the content of clinical records and ICD-9-CM code descriptions, and expansion of the glosses using the ICD-9-CM hierarchy. The best classifier achieved an overall F1 value of 88.2 on a data set of 978 free-text clinical records and outperformed two of the three human annotators.
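
The abstract does not spell out how gloss overlap is computed, so the following is only a minimal sketch of one plausible variant, not the authors' method. It scores a record against a code description by the fraction of gloss tokens found in the record, and optionally expands the gloss with those of the code's ancestors in the hierarchy. The function names, the tokenisation, the scoring formula, and the toy codes "A" and "A.1" are all assumptions made for illustration; they are not real ICD-9-CM entries.

```python
import re


def tokens(text):
    """Return the set of lowercase alphanumeric tokens in text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def gloss_overlap(record, gloss):
    """Fraction of gloss tokens that also occur in the record (assumed measure)."""
    gloss_tokens = tokens(gloss)
    if not gloss_tokens:
        return 0.0
    return len(tokens(record) & gloss_tokens) / len(gloss_tokens)


def expanded_gloss(code, glosses, parent):
    """Join a code's gloss with the glosses of its ancestors in the hierarchy."""
    parts = []
    while code is not None:
        parts.append(glosses.get(code, ""))
        code = parent.get(code)  # None once the root is reached
    return " ".join(parts)


# Hypothetical two-level hierarchy: "A.1" is a child of "A".
glosses = {"A": "respiratory symptoms", "A.1": "cough"}
parent = {"A.1": "A"}

record = "Patient presents with persistent cough and fever."
plain_score = gloss_overlap(record, glosses["A.1"])
expanded_score = gloss_overlap(record, expanded_gloss("A.1", glosses, parent))
```

In a classifier, scores like plain_score and expanded_score would be computed per candidate code and supplied as features alongside others such as negation; how the paper actually weights or combines them is not stated in the abstract.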