Evaluation of preprocessing techniques for chief complaint classification

Authors:
Jagan Dara;John N. Dowling;Debbie Travers;Gregory F. Cooper;Wendy W. Chapman
Affiliations:
Department of Biomedical Informatics, University of Pittsburgh, 200 Meyran Avenue, VALE M-183, Pittsburgh, PA 15260, USA;Department of Biomedical Informatics, University of Pittsburgh, 200 Meyran Avenue, VALE M-183, Pittsburgh, PA 15260, USA;School of Nursing, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA;Department of Biomedical Informatics, University of Pittsburgh, 200 Meyran Avenue, VALE M-183, Pittsburgh, PA 15260, USA;Department of Biomedical Informatics, University of Pittsburgh, 200 Meyran Avenue, VALE M-183, Pittsburgh, PA 15260, USA
Venue:
Journal of Biomedical Informatics
Year:
2008

Abstract

Objective: To determine whether preprocessing chief complaints before automatically classifying them into syndromic categories improves classification performance.

Methods: We preprocessed chief complaints using two preprocessors (CCP and EMT-P) and evaluated whether classification performance increased for a probabilistic classifier (CoCo) or for a keyword-based classifier (KC), a modification of the NYC Department of Health and Mental Hygiene chief complaint coder.

Results: CCP exhibited high accuracy (85%) in preprocessing chief complaints but only slightly improved CoCo's classification performance, and only for a few syndromes. EMT-P, which splits chief complaints into multiple problems, substantially increased CoCo's sensitivity for all syndromes. Preprocessing with CCP or EMT-P improved KC's sensitivity only for the Constitutional syndrome.

Conclusion: Evaluation of preprocessing systems should not be limited to the accuracy of the preprocessor but should include the effect of preprocessing on syndromic classification. Splitting chief complaints into multiple problems before classification is important for CoCo, but other preprocessing steps only slightly improved classification performance for CoCo and for a keyword-based classifier.
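The evaluation idea in the abstract (measure per-syndrome sensitivity with and without a splitting preprocessor) can be illustrated with a small sketch. Everything in it is a hypothetical stand-in: the keyword table, the `classify_single` and `split_problems` functions, and the four example complaints are invented for illustration and are not CoCo, KC, CCP, EMT-P, or the study data. The toy classifier assigns a single syndrome per input string, which is the situation in which splitting a complaint into separate problems can recover syndromes that would otherwise be missed.

```python
import re
from collections import defaultdict

# Hypothetical keyword table (an assumption for this sketch, not from any real coder).
KEYWORDS = {
    "cough": "Respiratory",
    "sob": "Respiratory",
    "vomiting": "Gastrointestinal",
    "diarrhea": "Gastrointestinal",
    "fever": "Constitutional",
}

def classify_single(text):
    """Toy single-label classifier: syndrome of the first keyword found, else None."""
    for token in re.findall(r"[a-z]+", text.lower()):
        if token in KEYWORDS:
            return KEYWORDS[token]
    return None

def split_problems(complaint):
    """Toy splitter: break on '/', ',', ';', '&', and the standalone word 'and'."""
    parts = re.split(r"[/,;&]|\band\b", complaint.lower())
    return [p.strip() for p in parts if p.strip()]

def predict(complaint, split):
    """Set of syndromes predicted for one complaint, with or without splitting."""
    segments = split_problems(complaint) if split else [complaint]
    labels = (classify_single(segment) for segment in segments)
    return {label for label in labels if label is not None}

def sensitivity(data, split):
    """Per-syndrome sensitivity: true positives / (true positives + false negatives)."""
    tp = defaultdict(int)
    fn = defaultdict(int)
    for complaint, gold in data:
        predicted = predict(complaint, split)
        for syndrome in gold:
            if syndrome in predicted:
                tp[syndrome] += 1
            else:
                fn[syndrome] += 1
    return {s: tp[s] / (tp[s] + fn[s]) for s in set(tp) | set(fn)}

# Invented complaints with invented gold-standard syndromes.
DATA = [
    ("cough/fever", {"Respiratory", "Constitutional"}),
    ("vomiting and diarrhea", {"Gastrointestinal"}),
    ("fever, vomiting", {"Constitutional", "Gastrointestinal"}),
    ("sob", {"Respiratory"}),
]

if __name__ == "__main__":
    for split in (False, True):
        print("split" if split else "no split", sorted(sensitivity(DATA, split).items()))
```

On this invented data, splitting raises sensitivity for the Constitutional and Gastrointestinal syndromes from 0.5 to 1.0, while Respiratory is unchanged at 1.0. The numbers say nothing about the real systems; the sketch only shows why a preprocessor is better judged by its effect on downstream classification than by its own accuracy.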