Recognizing and normalizing protein name mentions in biomedical literature is a challenging task that is important for text mining applications such as protein-protein interaction extraction, pathway reconstruction, and many more. In this paper, we present ProNormz, an integrated approach for tagging and normalizing human proteins (HPs). In Homo sapiens, a large number of biological processes are regulated through post-translational phosphorylation by a large human gene family, the protein kinases. Recognition and normalization of human protein kinases (HPKs) is therefore important for extracting information about their regulatory mechanisms from biomedical literature. Besides tagging and normalization, ProNormz distinguishes HPKs from other HPs. To our knowledge, ProNormz is the first normalization system that distinguishes HPKs from other HPs in addition to performing the gene normalization task. ProNormz incorporates a specialized synonym dictionary for human proteins and protein kinases, a set of 15 string matching rules, and a disambiguation module to achieve the normalization. Experimental results on the benchmark BioCreative II training and test datasets show that our integrated approach achieves fairly good performance and outperforms more sophisticated semantic similarity and disambiguation systems presented in the BioCreative II GN task. As a freely available web tool, ProNormz is useful to developers as an extensible gene normalization implementation, to researchers as a standard for comparing their innovative techniques, and to biologists for normalizing and categorizing HP and HPK mentions in biomedical literature. URL: http://www.biominingbu.org/pronormz.
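The pipeline described above (synonym dictionary, string matching rules, disambiguation, and HPK versus HP categorization) can be illustrated with a minimal sketch. It is not the ProNormz implementation: the abstract does not list the 15 rules, so the single rule in `normalize_key`, the `Entry` type, the placeholder identifiers (`G1` to `G4`), the made-up symbol `XYZ1`, and the cue-word disambiguation are all assumptions chosen for illustration.

```python
import re
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    """One dictionary record (hypothetical layout, not the ProNormz schema)."""
    gene_id: str      # placeholder identifier, not a real database ID
    is_kinase: bool   # True for a human protein kinase (HPK)
    cues: frozenset   # context words used only for disambiguation


# Illustrative synonym dictionary. "XYZ1" is an invented ambiguous symbol.
RAW_SYNONYMS = {
    "MAPK1": [Entry("G1", True, frozenset({"phosphorylation", "signaling"}))],
    "ERK2": [Entry("G1", True, frozenset({"phosphorylation", "signaling"}))],
    "TP53": [Entry("G2", False, frozenset({"tumor", "apoptosis"}))],
    "p53": [Entry("G2", False, frozenset({"tumor", "apoptosis"}))],
    "XYZ1": [
        Entry("G3", True, frozenset({"kinase", "phosphorylation"})),
        Entry("G4", False, frozenset({"membrane", "transport"})),
    ],
}


def normalize_key(mention: str) -> str:
    """One assumed string matching rule: fold case and drop every
    character that is not a letter or digit (spaces, hyphens, slashes)."""
    return re.sub(r"[^A-Z0-9]", "", mention.upper())


# Index the dictionary by normalized key so lookups are rule-invariant.
DICTIONARY = {normalize_key(name): entries for name, entries in RAW_SYNONYMS.items()}


def normalize_mention(mention: str, context: str = "") -> Optional[Entry]:
    """Map a mention to a dictionary entry, or None if it is unknown."""
    candidates = DICTIONARY.get(normalize_key(mention), [])
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]
    # Assumed disambiguation: prefer the candidate whose cue words overlap
    # most with the surrounding text; ties keep the first candidate.
    tokens = set(re.findall(r"[a-z]+", context.lower()))
    return max(candidates, key=lambda entry: len(entry.cues & tokens))
```

Under these assumptions, `normalize_mention("Erk-2")` resolves to the same entry as `MAPK1` and flags it as a kinase, while `normalize_mention("XYZ1", "XYZ1 mediates membrane transport")` selects the non-kinase candidate because its cue words match the context. A real system would need many more rules (for example Greek-letter and Roman-numeral variants) and a far stronger disambiguation model.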