Identifying Gene and Protein Names from Biological Texts
CSB '03 Proceedings of the IEEE Computer Society Conference on Bioinformatics
We studied contrast and variability in a corpus of gene names to identify potential heuristics for entity identification in the molecular biology domain. Based on our findings, we developed heuristics for mapping weakly matching gene names to their official gene names. We then tested these heuristics against a large body of Medline abstracts and found that they can increase recall, with varying levels of precision. Our findings also underscored the importance of good information retrieval and of the ability to disambiguate among genes, proteins, RNA, and a variety of other referents for performing entity identification with high precision.
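As an illustration of what mapping a weakly matching name to an official one might look like, here is a minimal sketch. The abstract does not list the heuristics themselves, so everything here is an assumption: the normalization rule (ignore case, hyphens, slashes, underscores, and whitespace), the function names `normalize`, `build_index`, and `map_mention`, and the small `OFFICIAL` symbol list are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical sample of official gene symbols, for illustration only.
OFFICIAL = ["BRCA1", "TP53", "IL2RA"]


def normalize(name: str) -> str:
    """Assumed heuristic: ignore case and separator characters."""
    # Hyphens, slashes, underscores, and whitespace are dropped so that
    # "Brca-1", "brca 1", and "BRCA1" all reduce to the same key.
    return re.sub(r"[\s\-/_]+", "", name.lower())


def build_index(official_names):
    """Map each normalized key to the set of official names sharing it."""
    index = {}
    for official in official_names:
        index.setdefault(normalize(official), set()).add(official)
    return index


def map_mention(mention, index):
    """Return every official name that weakly matches the mention, sorted."""
    return sorted(index.get(normalize(mention), set()))


INDEX = build_index(OFFICIAL)

for mention in ["Brca-1", "tp 53", "IL-2 RA", "p53"]:
    print(mention, "->", map_mention(mention, INDEX))
```

Loosening the match in this way lets "Brca-1" reach "BRCA1", which is the kind of recall gain the abstract describes. The function returns a list rather than a single name because a looser key can collide across several official names, and a surface match alone cannot tell whether the text refers to a gene, a protein, or RNA. A mention such as "p53" finds nothing here, since relating it to "TP53" would need a synonym list rather than a spelling heuristic.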