Building an automated SOAP classifier for emergency department reports

Authors:
Danielle Mowery;Janyce Wiebe;Shyam Visweswaran;Henk Harkema;Wendy W. Chapman
Affiliations:
Department of Biomedical Informatics, University of Pittsburgh, Parkvale Building M-183, 200 Meyran Avenue, Pittsburgh, PA 15260, USA;Intelligent Systems Program, University of Pittsburgh, 5113 Sennott Square, 210 South Bouquet Street, Pittsburgh, PA 15260, USA;Department of Biomedical Informatics, University of Pittsburgh, Parkvale Building M-183, 200 Meyran Avenue, Pittsburgh, PA 15260, USA and Intelligent Systems Program, University of Pittsburgh, 511 ...;Department of Biomedical Informatics, University of Pittsburgh, Parkvale Building M-183, 200 Meyran Avenue, Pittsburgh, PA 15260, USA;Department of Biomedical Informatics, University of Pittsburgh, Parkvale Building M-183, 200 Meyran Avenue, Pittsburgh, PA 15260, USA and Division of Biomedical Informatics, University of Californ ...
Venue:
Journal of Biomedical Informatics
Year:
2012

Abstract

Information extraction applications that extract structured event and entity information from unstructured text can leverage knowledge of clinical report structure to improve performance. The Subjective, Objective, Assessment, Plan (SOAP) framework, used to structure progress notes to facilitate problem-specific, clinical decision making by physicians, is one example of a well-known, canonical structure in the medical domain. Although its applicability to structuring data is understood, its contribution to information extraction tasks has not yet been determined. The first step to evaluating the SOAP framework's usefulness for clinical information extraction is to apply the model to clinical narratives and develop an automated SOAP classifier that classifies sentences from clinical reports. In this quantitative study, we applied the SOAP framework to sentences from emergency department reports, and trained and evaluated SOAP classifiers built with various linguistic features. We found the SOAP framework can be applied manually to emergency department reports with high agreement (Cohen's kappa coefficients over 0.70). Using a variety of features, we found classifiers for each SOAP class can be created with moderate to outstanding performance with F1 scores of 93.9 (subjective), 94.5 (objective), 75.7 (assessment), and 77.0 (plan). We look forward to expanding the framework and applying the SOAP classification to clinical information extraction tasks.
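
The abstract reports two kinds of numbers: inter-annotator agreement as Cohen's kappa, and per-class classifier performance as F1. The sketch below shows how those two statistics are conventionally computed for sentence-level labels. It is an illustration only, not the authors' evaluation code: the function names `cohens_kappa` and `f1_for_class`, the one-label-per-sentence setup, and the toy labels are all assumptions introduced here, and the toy data does not come from the study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same sentences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of sentences given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label rates.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


def f1_for_class(gold, predicted, target):
    """F1 for one class, treating it as a binary target-vs-rest decision."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == target and p == target for g, p in pairs)
    fp = sum(g != target and p == target for g, p in pairs)
    fn = sum(g == target and p != target for g, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented toy labels for six sentences (hypothetical, not study data).
annotator_1 = ["subjective", "subjective", "objective",
               "objective", "assessment", "plan"]
annotator_2 = ["subjective", "objective", "objective",
               "objective", "assessment", "plan"]

print(round(cohens_kappa(annotator_1, annotator_2), 3))
print(round(f1_for_class(annotator_1, annotator_2, "objective"), 3))
```

Note that the abstract gives F1 on a 0 to 100 scale, whereas this sketch returns values between 0 and 1.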