Disambiguation of ambiguous biomedical terms using examples generated from the UMLS Metathesaurus

Authors:
Mark Stevenson;Yikun Guo
Affiliations:
Natural Language Processing Group, Department of Computer Science, University of Sheffield, Regent Court, 211 Portobello, Sheffield S1 4DP, United Kingdom;Natural Language Processing Group, Department of Computer Science, University of Sheffield, Regent Court, 211 Portobello, Sheffield S1 4DP, United Kingdom
Venue:
Journal of Biomedical Informatics
Year:
2010

Citing 16
Cited 2

An interactive system for finding complementary literatures: a stimulus to scientific discovery

Artificial Intelligence - Special issue on scientific discovery
Introduction to the special issue on word sense disambiguation: the state of the art

Computational Linguistics - Special issue on word sense disambiguation
Using corpus statistics and WordNet relations for sense identification

Computational Linguistics - Special issue on word sense disambiguation
Automatic retrieval and clustering of similar words

COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 2
A decision tree of bigrams is an accurate predictor of word sense

NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies
Exploiting parallel texts for word sense disambiguation: an empirical study

ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Word sense disambiguation by selecting the best semantic type based on Journal Descriptor Indexing: Preliminary experiment

Journal of the American Society for Information Science and Technology
Finding predominant word senses in untagged text

ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Word sense disambiguation: A survey

ACM Computing Surveys (CSUR)
Word sense disambiguation across two domains: Biomedical literature and clinical notes

Journal of Biomedical Informatics
Inter-coder agreement for computational linguistics

Computational Linguistics
Knowledge sources for word sense disambiguation of biomedical text

BioNLP '08 Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing
Disambiguation of biomedical abbreviations

BioNLP '09 Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing
Good neighbors make good senses: exploiting distributional similarity for unsupervised WSD

COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
Acquiring sense tagged examples using relevance feedback

COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
SemEval-2007 task 17: English lexical sample, SRL and all words

SemEval '07 Proceedings of the 4th International Workshop on Semantic Evaluations

Disambiguation in the biomedical domain: The role of ambiguity type

Journal of Biomedical Informatics
Scaling up WSD with automatically generated examples

BioNLP '12 Proceedings of the 2012 Workshop on Biomedical Natural Language Processing

Quantified Score

Hi-index	0.00

Visualization

Abstract

Researchers have access to a vast amount of information stored in textual documents and there is a pressing need for the development of automated methods to enable and improve access to this resource. Lexical ambiguity, the phenomena in which a word or phrase has more than one possible meaning, presents a significant obstacle to automated text processing. Word Sense Disambiguation (WSD) is a technology that resolves these ambiguities automatically and is an important stage in text understanding. The most accurate approaches to WSD rely on manually labeled examples but this is usually not available and is prohibitively expensive to create. This paper offers a solution to that problem by using information in the UMLS Metathesaurus to automatically generate labeled examples. Two approaches are presented. The first is an extension of existing work (Liu et al., 2002 [1]) and the second a novel approach that exploits information in the UMLS that has not been used for this purpose. The automatically generated examples are evaluated by comparing them against the manually labeled ones in the NLM-WSD data set and are found to outperform the baseline. The examples generated using the novel approach produce an improvement in WSD performance when combined with manually labeled examples.