IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
We present an approach to automatically detecting names of proteins, genes, species, and similar entities in biomedical literature and grounding them to widely accepted identifiers. The annotation relies on three components: a large term list containing the common forms of each term, a normalization step that matches these terms against their actual representation in the texts, and a disambiguation step that resolves the ambiguity of matched terms. We describe various characteristics of the terms found in existing term resources and of the terms used in biomedical texts. Evaluated against a corpus of manually annotated protein mentions, the approach achieves a precision of 57% and a recall of 72%.
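The three components named in the abstract (term list, normalization, disambiguation) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy term list, the placeholder identifiers (`PROT:0001` and so on), the normalization rule (lowercase, strip non-alphanumerics), the longest-match lookup, and the species-based disambiguation heuristic are all assumptions made for illustration.

```python
import re

# Hypothetical toy term list: surface form -> (identifier, species) candidates.
# The identifiers are placeholders, not entries from a real database.
TERM_LIST = {
    "IL-2": [("PROT:0001", "human"), ("PROT:0002", "mouse")],
    "interleukin 2": [("PROT:0001", "human"), ("PROT:0002", "mouse")],
    "p53": [("PROT:0003", "human")],
}


def normalize(term):
    """Assumed normalization: lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", term.lower())


def build_index(term_list):
    """Map each normalized surface form to the set of its candidate identifiers."""
    index = {}
    for surface, candidates in term_list.items():
        index.setdefault(normalize(surface), set()).update(candidates)
    return index


def disambiguate(candidates, focus_species):
    """Assumed heuristic: keep the candidate whose species matches the focus species.

    Returns None when the ambiguity cannot be resolved.
    """
    if len(candidates) == 1:
        return next(iter(candidates))[0]
    matching = [ident for ident, species in candidates if species == focus_species]
    return matching[0] if len(matching) == 1 else None


def annotate(text, index, focus_species, max_tokens=3):
    """Greedy longest-match lookup of token spans against the normalized index."""
    tokens = list(re.finditer(r"[A-Za-z0-9]+", text))
    annotations = []
    i = 0
    while i < len(tokens):
        # Try the longest span first, then shrink it one token at a time.
        for n in range(min(max_tokens, len(tokens) - i), 0, -1):
            start, end = tokens[i].start(), tokens[i + n - 1].end()
            key = normalize(text[start:end])
            if key in index:
                identifier = disambiguate(index[key], focus_species)
                annotations.append((text[start:end], identifier))
                i += n
                break
        else:
            i += 1  # no span starting at this token matched
    return annotations


index = build_index(TERM_LIST)
result = annotate("Interleukin-2 and IL 2 activate P53 in T cells.", index, "human")
print(result)
```

In this sketch, normalization lets the variants "Interleukin-2" and "IL 2" match the term-list entries "interleukin 2" and "IL-2", and a mention whose candidates do not include the assumed focus species is left ungrounded (`None`) rather than guessed.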