Word Sense Disambiguation (WSD), the automatic identification of the meanings of ambiguous terms in a document, is an important stage in text processing. We describe a WSD system developed specifically for the types of ambiguity found in biomedical documents. The system draws on a range of knowledge sources: linguistic features, such as local collocations, and features derived from two domain-specific resources, the Unified Medical Language System (UMLS) and Medical Subject Headings (MeSH). It is applied to three types of ambiguity found in Medline abstracts: ambiguous terms, abbreviations with multiple expansions, and names that are ambiguous between genes. On the standard NLM-WSD data set, which consists of ambiguous terms from Medline abstracts, the system performs well in comparison with previously reported results. Both the system's performance and the contribution of each knowledge source depend on the type of lexical ambiguity. A combination of linguistic features and MeSH terms correctly disambiguates 87.9% of the ambiguous terms, combining all knowledge sources disambiguates 99% of abbreviations, and MeSH terms alone disambiguate 97.2% of ambiguous gene names. Analysis reveals that these differences are caused by the nature of each ambiguity type. These results should be taken into account when deciding which information to use for WSD and what level of performance can be expected.
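As a rough illustration of how local collocations and MeSH terms might be combined into a single feature set and switched on or off per ambiguity type, consider the minimal sketch below. It is not the system described above: the function names, the feature encoding, the example sentences, the MeSH headings attached to them, and the simple overlap-based scorer are all assumptions made for illustration, and the abstract does not state which learning algorithm the system uses.

```python
from collections import Counter


def collocation_features(tokens, target_index, window=2):
    """Local collocations: the words at fixed offsets around the ambiguous term."""
    features = set()
    for offset in range(-window, window + 1):
        if offset == 0:
            continue  # skip the ambiguous term itself
        position = target_index + offset
        if 0 <= position < len(tokens):
            features.add(f"colloc[{offset:+d}]={tokens[position].lower()}")
    return features


def mesh_features(mesh_terms):
    """One feature per MeSH heading assigned to the abstract (hypothetical encoding)."""
    return {f"mesh={term}" for term in mesh_terms}


def extract_features(tokens, target_index, mesh_terms,
                     use_collocations=True, use_mesh=True):
    """Combine knowledge sources; the flags allow one source to be used alone."""
    features = set()
    if use_collocations:
        features |= collocation_features(tokens, target_index)
    if use_mesh:
        features |= mesh_features(mesh_terms)
    return features


def train(examples):
    """Build a per-sense feature count profile from (feature_set, sense) pairs."""
    profiles = {}
    for features, sense in examples:
        profiles.setdefault(sense, Counter()).update(features)
    return profiles


def predict(profiles, features):
    """Toy scorer (an assumption): pick the sense whose profile overlaps most."""
    return max(sorted(profiles),
               key=lambda sense: sum(profiles[sense][f] for f in features))


# Hypothetical training instances for the ambiguous term "cold".
training = [
    (extract_features("patients with a cold reported cough".split(), 3,
                      ["Common Cold"]), "common_cold"),
    (extract_features("cells stored in cold buffer overnight".split(), 3,
                      ["Cold Temperature"]), "cold_temperature"),
]
profiles = train(training)

# Hypothetical test instance using both knowledge sources.
test_features = extract_features("children with a cold and fever".split(), 3,
                                 ["Common Cold", "Child"])
predicted_sense = predict(profiles, test_features)

# The same instance using MeSH terms alone.
mesh_only = extract_features("children with a cold and fever".split(), 3,
                             ["Common Cold", "Child"], use_collocations=False)
predicted_mesh_only = predict(profiles, mesh_only)
```

Keeping each knowledge source behind its own flag mirrors the observation that the best combination differs by ambiguity type, so a sketch like this could be rerun with different sources enabled for terms, abbreviations, and gene names.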