Journals and conference proceedings are the dominant mechanisms for reporting new biomedical results. Because these publications are unstructured, it is difficult to apply data mining or automated knowledge discovery techniques to them. Annotation (or markup) is the first step in making such documents machine-analyzable. In biomedical literature, however, entity extraction is often difficult because similar (or identical) labels are used for different entities and different labels are used for the same entity. In this paper we present a system called BioAnnotator for identifying and classifying biological terms in documents. BioAnnotator uses domain-based dictionary lookup to recognize known terms and a rule engine to discover new terms. We explain how the system uses a biomedical dictionary to learn extraction patterns for the rule engine and how it disambiguates biological terms that belong to multiple semantic classes.
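
The two-stage idea described above, dictionary lookup for known terms followed by rules for unseen ones, can be illustrated with a minimal Python sketch. This is not BioAnnotator's implementation: the dictionary entries, the hand-written regular-expression rules, and the function name `annotate` are all hypothetical, and the system described here learns its extraction patterns from a biomedical dictionary rather than having them hard-coded. Disambiguation of terms that belong to several semantic classes is left out.

```python
import re

# Hypothetical toy dictionary mapping lowercase terms to a semantic class.
# A real system would load a much larger biomedical lexicon.
DICTIONARY = {
    "p53": "protein",
    "insulin": "protein",
    "brca1": "gene",
}

# Hypothetical hand-written rules: a token followed by a head noun such as
# "gene" or "kinase" is labeled with the class that the head noun suggests.
RULES = [
    (re.compile(r"\b([A-Za-z0-9-]+) (?:protein|kinase|receptor)\b"), "protein"),
    (re.compile(r"\b([A-Za-z0-9-]+) gene\b"), "gene"),
]


def annotate(text):
    """Return sorted (start, end, surface, label, source) tuples for text."""
    annotations = []
    covered = set()  # start offsets that already carry an annotation

    # Stage 1: dictionary lookup for known terms.
    for match in re.finditer(r"[A-Za-z0-9-]+", text):
        label = DICTIONARY.get(match.group().lower())
        if label is not None:
            annotations.append(
                (match.start(), match.end(), match.group(), label, "dictionary")
            )
            covered.add(match.start())

    # Stage 2: rules propose new terms that the dictionary did not cover.
    for pattern, label in RULES:
        for match in pattern.finditer(text):
            if match.start(1) not in covered:
                annotations.append(
                    (match.start(1), match.end(1), match.group(1), label, "rule")
                )
                covered.add(match.start(1))

    return sorted(annotations)


if __name__ == "__main__":
    sentence = "The XYZ1 gene interacts with p53 and the Abc2 kinase."
    for annotation in annotate(sentence):
        print(annotation)
```

In this sketch, dictionary matches take precedence, so a rule never relabels a term that the dictionary already recognized. That ordering is a design assumption made for the example, not something stated in the abstract.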