UMLS content views appropriate for NLP processing of the biomedical literature vs. clinical text

Authors:
Dina Demner-Fushman;James G. Mork;Sonya E. Shooshan;Alan R. Aronson
Affiliations:
Lister Hill National Center for Biomedical Communications (LHNCBC), U.S. National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA;Lister Hill National Center for Biomedical Communications (LHNCBC), U.S. National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA;Lister Hill National Center for Biomedical Communications (LHNCBC), U.S. National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA;Lister Hill National Center for Biomedical Communications (LHNCBC), U.S. National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
Venue:
Journal of Biomedical Informatics
Year:
2010

Abstract

Identification of medical terms in free text is a first step in such Natural Language Processing (NLP) tasks as automatic indexing of biomedical literature and extraction of patients' problem lists from the text of clinical notes. Many tools developed to perform these tasks use biomedical knowledge encoded in the Unified Medical Language System (UMLS) Metathesaurus. We continue our exploration of automatic approaches to the creation of subsets (UMLS content views) that can support NLP processing of either the biomedical literature or clinical text. We found that suppression of highly ambiguous terms in the conservative AutoFilter content view can partially replace manual filtering for literature applications, and that suppression of two-character mappings in the same content view achieves 89.5% precision at 78.6% recall for clinical applications.
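
The two kinds of suppression mentioned in the abstract, and the precision and recall used to evaluate them, can be illustrated with a minimal Python sketch. This is not the authors' AutoFilter procedure: the function names, the `max_senses` and `min_length` thresholds, and the placeholder concept identifiers (`X1`, `X2`, ...) are all assumptions made for illustration only.

```python
from collections import defaultdict


def build_content_view(mappings, max_senses=3, min_length=3):
    """Return the (term, concept_id) pairs that survive both filters.

    max_senses and min_length are illustrative assumptions,
    not values taken from the paper.
    """
    # Count how many distinct concepts each (case-folded) term maps to.
    senses = defaultdict(set)
    for term, concept in mappings:
        senses[term.lower()].add(concept)

    kept = []
    for term, concept in mappings:
        key = term.lower()
        if len(key) < min_length:
            continue  # suppress very short strings, e.g. two-character mappings
        if len(senses[key]) > max_senses:
            continue  # suppress highly ambiguous terms
        kept.append((term, concept))
    return kept


def precision_recall(predicted, gold):
    """Compute precision and recall of predicted concepts against a gold set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy data with hypothetical concept identifiers (not real UMLS CUIs).
toy_mappings = [
    ("cold", "X1"), ("cold", "X2"), ("cold", "X3"), ("cold", "X4"),
    ("MI", "X5"),
    ("asthma", "X6"),
    ("fever", "X7"),
]

view = build_content_view(toy_mappings)
found = [concept for _, concept in view]
precision, recall = precision_recall(found, gold=["X5", "X6", "X7"])
```

In this toy example the ambiguous term "cold" and the two-character string "MI" are both suppressed. Because "MI" corresponds to a gold concept, recall drops while precision stays high, which is the kind of trade-off the reported precision and recall figures describe.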