Section classification in clinical notes using supervised hidden Markov model

Authors:
Ying Li;Sharon Lipsky Gorman;Noémie Elhadad
Affiliations:
Columbia University, New York, NY, USA;Columbia University, New York, NY, USA;Columbia University, New York, NY, USA
Venue:
Proceedings of the 1st ACM International Health Informatics Symposium
Year:
2010

Abstract

As more information becomes available in the Electronic Health Record as free-text narrative, there is a need for automated tools that can process and understand such texts. A first step towards the automated processing of clinical texts is to determine the document-level structure of a patient note, i.e., to identify its sections and map them automatically to known section types. This paper treats section mapping as a sequence-labeling problem over 15 known section types. Our method relies on a Hidden Markov Model (HMM) trained on a corpus of 9,679 clinical notes from NewYork-Presbyterian Hospital. We compare our method to a state-of-the-art baseline that ignores the sequential order of sections and classifies each section independently of the others in a note. Experiments show that our method significantly outperforms the baseline, yielding 93% accuracy in identifying individual sections and 70% accuracy in identifying all the sections in a note, compared to 70% and 19%, respectively, for the baseline.
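To make the sequence-labeling framing concrete, the following is a minimal sketch of a supervised HMM over section types, not the authors' implementation. It assumes details the abstract does not state: each hidden state is a section type, each observation is the bag of tokens in one section, emissions are unigram probabilities with add-one smoothing, and decoding uses the Viterbi algorithm. The function names `train_hmm` and `viterbi`, the labels `CC`, `HPI`, and `MEDS`, and the toy notes are hypothetical and for illustration only.

```python
import math
from collections import Counter, defaultdict


def train_hmm(notes):
    """Count-based (supervised) training.

    notes: list of notes; each note is a list of (section_label, tokens).
    """
    labels = sorted({lab for note in notes for lab, _ in note})
    vocab = {tok for note in notes for _, toks in note for tok in toks}
    start = Counter()             # label of the first section in a note
    trans = defaultdict(Counter)  # previous label -> next label counts
    emit = defaultdict(Counter)   # label -> token counts
    for note in notes:
        prev = None
        for lab, toks in note:
            if prev is None:
                start[lab] += 1
            else:
                trans[prev][lab] += 1
            emit[lab].update(toks)
            prev = lab
    # One extra vocabulary slot stands in for unseen tokens (an assumption).
    return {"labels": labels, "vocab_size": len(vocab) + 1,
            "start": start, "trans": trans, "emit": emit}


def _log_smoothed(counter, key, n_outcomes):
    """Log probability with add-one smoothing over n_outcomes outcomes."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))


def viterbi(model, sections):
    """Return the most likely label sequence for a list of token lists."""
    if not sections:
        return []
    labels, k = model["labels"], len(model["labels"])

    def emit_lp(lab, toks):
        return sum(_log_smoothed(model["emit"][lab], t, model["vocab_size"])
                   for t in toks)

    def trans_lp(prev, lab):
        return _log_smoothed(model["trans"][prev], lab, k)

    scores = [{lab: _log_smoothed(model["start"], lab, k)
               + emit_lp(lab, sections[0]) for lab in labels}]
    back = []
    for toks in sections[1:]:
        row, ptr = {}, {}
        for lab in labels:
            best = max(labels, key=lambda p: scores[-1][p] + trans_lp(p, lab))
            row[lab] = scores[-1][best] + trans_lp(best, lab) + emit_lp(lab, toks)
            ptr[lab] = best
        scores.append(row)
        back.append(ptr)
    # Follow back-pointers from the best final state.
    path = [max(labels, key=lambda lab: scores[-1][lab])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


# Toy training data (hypothetical, not from the paper's corpus).
toy_notes = [
    [("CC", ["chest", "pain"]),
     ("HPI", ["patient", "reports", "chest", "pain", "since", "monday"]),
     ("MEDS", ["aspirin", "daily"])],
    [("CC", ["shortness", "of", "breath"]),
     ("HPI", ["patient", "reports", "shortness", "of", "breath"]),
     ("MEDS", ["albuterol", "inhaler"])],
]
toy_model = train_hmm(toy_notes)
print(viterbi(toy_model, [["chest", "pain"],
                          ["patient", "reports", "pain"],
                          ["aspirin"]]))
```

A baseline that ignores section order, as described in the abstract, would correspond to scoring each section by its emission term alone. The transition term is what lets the label chosen for one section inform the label chosen for the next.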