Abbreviations and acronyms are widely used in the biomedical literature. Because many of them carry important content, information retrieval and extraction benefit from identifying their meanings. Many abbreviations and acronyms are ambiguous, however, so it is important to map them to their full forms, which ultimately represent their meanings. In this study, we present a semi-supervised method that uses MEDLINE as a knowledge source for disambiguating abbreviations and acronyms in full-text biomedical journal articles. We first automatically generated a dictionary of abbreviation/full-form pairs from MEDLINE abstracts, using a rule-based system that maps abbreviations to full forms when the full forms are defined in the abstracts. We then trained supervised machine-learning algorithms on the MEDLINE abstracts and applied them in a semi-supervised fashion to predict the full forms of abbreviations in full-text journal articles. We report up to 92% prediction precision and up to 91% coverage.
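The dictionary-building step can be illustrated with a minimal sketch. The abstract does not specify the rules used, so the heuristic below is an assumption chosen for brevity: it accepts a "full form (ABBR)" definition only when the initials of the words directly before the parenthesis spell the abbreviation. The names `SHORT_FORM`, `extract_pairs`, and `build_dictionary` are hypothetical, and the example sentences are invented for illustration.

```python
import re
from collections import defaultdict

# Assumed pattern (not from the paper): a short alphanumeric token
# in parentheses, e.g. "(PCR)".
SHORT_FORM = re.compile(r"\(([A-Za-z][A-Za-z0-9]{1,9})\)")


def extract_pairs(text):
    """Return (abbreviation, full form) pairs written as 'full form (ABBR)'."""
    pairs = []
    for match in SHORT_FORM.finditer(text):
        short = match.group(1)
        # Words that appear before the opening parenthesis.
        words = re.findall(r"[A-Za-z0-9]+", text[:match.start()])
        # Take as many preceding words as the abbreviation has characters.
        candidate = words[-len(short):]
        if len(candidate) < len(short):
            continue
        # Simplified rule: the word initials must spell the abbreviation.
        initials = "".join(word[0] for word in candidate)
        if initials.lower() == short.lower():
            pairs.append((short, " ".join(candidate)))
    return pairs


def build_dictionary(abstracts):
    """Map each abbreviation to the set of full forms seen for it."""
    dictionary = defaultdict(set)
    for abstract in abstracts:
        for short, full in extract_pairs(abstract):
            dictionary[short].add(full.lower())
    return dictionary


# Invented example sentences showing an ambiguous abbreviation.
abstracts = [
    "Expression of the androgen receptor (AR) was measured.",
    "Inhibitors of aldose reductase (AR) were tested.",
]
dictionary = build_dictionary(abstracts)
print(sorted(dictionary["AR"]))  # ['aldose reductase', 'androgen receptor']
```

An abbreviation that maps to more than one full form, as "AR" does here, is the kind of ambiguous entry that the disambiguation step would then need to resolve from context. A realistic rule set would also have to handle full forms whose words do not align one-to-one with the abbreviation's letters, which this sketch ignores.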