High-performance gene name normalization with GeNo

Authors:
Joachim Wermter;Katrin Tomanek;Udo Hahn
Venue:
Bioinformatics
Year:
2009

Citing 0
Cited 10

Event extraction from trimmed dependency graphs
BioNLP '09 Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing: Shared Task

How feasible and robust is the automatic extraction of gene regulation events?: a cross-method evaluation under lab and real-life conditions
BioNLP '09 Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing

Mining the relationship between gene and disease from literature
FSKD'09 Proceedings of the 6th international conference on Fuzzy systems and knowledge discovery - Volume 7

Efficient Extraction of Protein-Protein Interactions from Full-Text Articles
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)

Combining multiple disambiguation methods for gene mention normalization
Expert Systems with Applications: An International Journal

EVEX: a pubmed-scale resource for homology-based generalization of text mining predictions
BioNLP '11 Proceedings of BioNLP 2011 Workshop

Towards automatic pathway generation from biological full-text publications
IDA'11 Proceedings of the 10th international conference on Advances in intelligent data analysis X

Overview of the entity relations (REL) supporting task of BioNLP Shared Task 2011
BioNLP Shared Task '11 Proceedings of the BioNLP Shared Task 2011 Workshop

Unsupervised corpus distillation for represented indicator measurement on focus species detection
International Journal of Data Mining and Bioinformatics

ProNormz - An integrated approach for human proteins and protein kinases normalization
Journal of Biomedical Informatics

Abstract

Motivation: The recognition and normalization of textual mentions of gene and protein names is both particularly important and challenging. It is important because genes and proteins constitute the crucial conceptual entities in biomedicine. It remains challenging because of widespread gene name ambiguities within species, across species, with common English words and with medical sublanguage terms.

Results: We present GeNo, a highly competitive system for gene name normalization, which obtains an F-measure of 86.4% (precision: 87.8%, recall: 85.0%) on the BioCreAtIvE-II test set, thus being on a par with the best system on that task. GeNo tackles the complex gene normalization problem by employing a carefully crafted suite of symbolic and statistical methods, and by relying fully on publicly available software and data resources, including extensive background knowledge based on semantic profiling. A major goal of our work is to present GeNo's architecture in a lucid and perspicuous way, so that our results can be fully reproduced.

Availability: GeNo, including its underlying resources, will be available from www.julielab.de. It is also currently deployed in the Semedico search engine at www.semedico.org.

Contact: joachim.wermter@uni-jena.de
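
The F-measure reported in the abstract can be checked against the stated precision and recall. The sketch below assumes that the reported figure is the balanced F-measure (F1, the harmonic mean of precision and recall); the abstract does not say so explicitly. The function name `f_measure` and its `beta` parameter are illustrative and not part of GeNo.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score for precision and recall given in [0, 1].

    With beta = 1.0 this is the balanced F-measure (harmonic mean).
    Assumption: the abstract's F-measure is this balanced variant.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when both inputs are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Values as reported in the abstract for the BioCreAtIvE-II test set.
reported_precision = 0.878
reported_recall = 0.850

f1 = f_measure(reported_precision, reported_recall)
print(f"F1 = {f1:.1%}")
```

Under that assumption the result is about 0.8638, which rounds to the reported 86.4%.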