As the pace of biological research accelerates, biologists are becoming increasingly reliant on computers to manage the information explosion. Biologists communicate their research findings by relying on precise biological terms; these terms then provide indices into the literature and across the growing number of biological databases. This article examines emerging techniques to access biological resources through extraction of entity names and relations among them. Information extraction has been an active area of research in natural language processing and there are promising results for information extraction applied to news stories, e.g., balanced precision and recall in the 93-95% range for identifying person, organization and location names. But these results do not seem to transfer directly to biological names, where results remain in the 75-80% range. Multiple factors may be involved, including absence of shared training and test sets for rigorous measures of progress, lack of annotated training data specific to biological tasks, pervasive ambiguity of terms, frequent introduction of new terms, and a mismatch between evaluation tasks as defined for news and real biological problems. We present evidence from a simple lexical matching exercise that illustrates some specific problems encountered when identifying biological names. We conclude by outlining a research agenda to raise performance of named entity tagging to a level where it can be used to perform tasks of biological importance.
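To make the lexical matching idea concrete, here is a minimal sketch of dictionary-based name tagging scored with precision and recall. It is not the procedure used in the article: the lexicon, the sentence, the gold annotation, and the function names `tag_names` and `precision_recall` are all made up for illustration. The example assumes that "white" is used as an ordinary adjective in the sentence, so matching it as a gene name counts as a false positive. This is the kind of ambiguity the abstract refers to.

```python
import re

def tag_names(text, lexicon):
    """Return sorted (start, end, surface) spans where a lexicon entry
    matches as a whole token, ignoring case."""
    spans = set()
    for name in lexicon:
        # Lookarounds stop a name from matching inside a longer word.
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            spans.add((m.start(), m.end(), m.group(0)))
    return sorted(spans)

def precision_recall(predicted, gold):
    """Exact-span precision and recall of predicted spans against gold spans."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical toy data: two gene names that are also ordinary English words.
lexicon = {"wingless", "white"}
text = "Mutations in wingless alter the white stripe pattern."

# Assumed gold annotation: only "wingless" is used as a gene name here.
start = text.index("wingless")
gold = [(start, start + len("wingless"), "wingless")]

predicted = tag_names(text, lexicon)
precision, recall = precision_recall(predicted, gold)
print(precision, recall)
```

Under these assumptions the matcher finds every gold name but also tags the adjective, so recall is perfect while precision is halved. Plain dictionary lookup cannot resolve such cases without context.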