Gene and protein named-entity recognition (NER) and normalization are often treated as a two-step process. The first step, NER, has received considerable attention over the last few years; normalization has received much less. We have built a dictionary-based gene and protein NER and normalization system that requires no supervised training and no human intervention to build its dictionaries from online genomics resources. We tested the system on the Genia corpus and the BioCreative Task 1B mouse and yeast corpora and achieved performance comparable to state-of-the-art systems that require supervised learning and manual dictionary creation. Our technique should also work for organisms whose naming conventions are similar to those of mouse, such as human. Further evaluation and improvement of gene/protein NER and normalization systems are somewhat hampered by the lack of larger test collections and of collections for additional organisms, such as human.
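To make the general idea of dictionary-based recognition and normalization concrete, here is a minimal Python sketch. It is not the system described above: the function names, the key-normalization rule (lowercase, drop non-alphanumerics), the longest-match strategy, and the identifiers such as `GENE:0001` are all assumptions chosen for illustration.

```python
import re


def normalize_key(text):
    """Lowercase and drop non-alphanumerics so 'IL-2' and 'il 2' share a key."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def build_lexicon(synonyms_by_id):
    """Map each normalized synonym to the set of identifiers that use it."""
    lexicon = {}
    for gene_id, names in synonyms_by_id.items():
        for name in names:
            lexicon.setdefault(normalize_key(name), set()).add(gene_id)
    return lexicon


def tag_and_normalize(text, lexicon, max_tokens=4):
    """Greedy longest-match lookup; returns (mention, identifier) pairs.

    Synonyms shared by several identifiers are skipped as ambiguous
    (an illustrative policy, not one taken from the abstract).
    """
    tokens = re.findall(r"[A-Za-z0-9]+", text)
    hits = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, then shrink it.
        for n in range(min(max_tokens, len(tokens) - i), 0, -1):
            span = tokens[i:i + n]
            ids = lexicon.get(normalize_key("".join(span)))
            if ids and len(ids) == 1:
                hits.append((" ".join(span), next(iter(ids))))
                i += n
                break
        else:
            i += 1  # no match starts at this token
    return hits


# Hypothetical synonym table; the identifiers are made up for this example.
SYNONYMS = {
    "GENE:0001": ["Il2", "interleukin 2", "IL-2"],
    "GENE:0002": ["Trp53", "p53"],
}

lexicon = build_lexicon(SYNONYMS)
result = tag_and_normalize("Expression of interleukin-2 and p53 was measured.", lexicon)
print(result)
```

In this sketch, recognition and normalization happen in the same lookup: a span counts as a mention only if its normalized form appears in the lexicon, and the lexicon entry supplies the identifier. A real system would build the synonym table automatically from genomics databases and would need a more careful treatment of ambiguous and common-word synonyms.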