Abbreviations and acronyms are widely used in the biomedical literature, and many of them represent important biomedical concepts. Because many abbreviations are ambiguous (e.g., CAT can denote either chloramphenicol acetyl transferase or computed axial tomography, depending on the context), recognizing the full form associated with each abbreviation is in most cases equivalent to identifying the meaning of the abbreviation. This, in turn, allows more accurate natural language processing, information extraction, and retrieval. In this study, we developed supervised approaches to identifying the full forms of ambiguous abbreviations within the context in which they appear. We first automatically generated a dictionary of all possible full forms for each abbreviation; we then treated the in-context full-form prediction for each specific abbreviation occurrence as a case of word-sense disambiguation and applied supervised machine-learning algorithms to it. Because some of the links between abbreviations and their corresponding full forms are given explicitly in the text and can be recovered automatically, we used these explicit links to provide training data automatically for disambiguating abbreviations that are not linked to a full form within a text. We evaluated our methods on more than 150,000 abstracts and obtained coverage and precision of 82% and 92%, respectively, under tenfold cross-validation, and 79% and 80%, respectively, when evaluated against an external set of abstracts in which the abbreviations are not defined.
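The idea of using explicit definitions as automatically labeled training data can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the extraction method or learning algorithm used in the study: the `DEFINITION` pattern, the first-letter matching heuristic, the helper names (`find_definitions`, `context_tokens`, `AbbreviationDisambiguator`), the choice of a Naive Bayes classifier with add-one smoothing, and the four toy sentences.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical, simplified pattern for "<full form> (<ABBR>)".
# An assumption for illustration only, not the study's extraction method.
DEFINITION = re.compile(r"\b((?:[A-Za-z-]+ ){1,6})\(([A-Z]{2,6})\)")


def find_definitions(text):
    """Return (abbreviation, full form) pairs for explicit definitions whose
    full-form words start with the letters of the abbreviation."""
    found = []
    for match in DEFINITION.finditer(text):
        words = match.group(1).split()
        abbr = match.group(2)
        candidate = words[-len(abbr):]
        if len(candidate) == len(abbr) and all(
            word[0].lower() == letter.lower()
            for word, letter in zip(candidate, abbr)
        ):
            found.append((abbr, " ".join(candidate).lower()))
    return found


def context_tokens(text, full_form=""):
    """Lowercased word tokens; the full form is removed so that the model
    learns from the surrounding context rather than from the answer."""
    lowered = text.lower()
    if full_form:
        lowered = lowered.replace(full_form, " ")
    return re.findall(r"[a-z]+", lowered)


class AbbreviationDisambiguator:
    """Toy Naive Bayes word-sense classifier (assumed algorithm)."""

    def __init__(self):
        self.sense_counts = defaultdict(Counter)  # abbr -> full form -> texts
        self.word_counts = defaultdict(Counter)   # (abbr, full form) -> words
        self.vocabulary = set()

    def train(self, abstracts):
        # Each explicit definition yields one labeled training example.
        for text in abstracts:
            for abbr, full_form in find_definitions(text):
                tokens = context_tokens(text, full_form)
                self.sense_counts[abbr][full_form] += 1
                self.word_counts[(abbr, full_form)].update(tokens)
                self.vocabulary.update(tokens)

    def predict(self, abbr, text):
        senses = self.sense_counts.get(abbr)
        if not senses:
            return None  # abbreviation is not covered by the dictionary
        tokens = context_tokens(text)
        total = sum(senses.values())
        vocab_size = len(self.vocabulary)

        def score(full_form):
            counts = self.word_counts[(abbr, full_form)]
            denominator = sum(counts.values()) + vocab_size
            log_prob = math.log(senses[full_form] / total)
            for token in tokens:
                log_prob += math.log((counts[token] + 1) / denominator)
            return log_prob

        return max(senses, key=score)


# Toy abstracts in which the abbreviation is explicitly defined.
TRAINING_TEXTS = [
    "The chloramphenicol acetyl transferase (CAT) reporter gene was expressed in transfected cells.",
    "Promoter activity was measured with a chloramphenicol acetyl transferase (CAT) assay in cells.",
    "Patients underwent computed axial tomography (CAT) to scan the brain for lesions.",
    "A computed axial tomography (CAT) scan of the patients showed a tumor.",
]

model = AbbreviationDisambiguator()
model.train(TRAINING_TEXTS)

# Occurrences without an explicit definition are resolved from context alone.
print(model.predict("CAT", "The CAT reporter gene activity was measured in transfected cells."))
print(model.predict("CAT", "The patients received a CAT scan of the brain."))
```

In this sketch, the set of full forms collected per abbreviation plays the role of the dictionary, and returning `None` for an abbreviation that was never seen with a definition corresponds loosely to an occurrence outside the method's coverage.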