We investigate strategies for finding chemicals in biomedical text using substring co-occurrence information. The goal is to build a system from readily available data with minimal human involvement. Our models are trained on a dictionary of chemical names and on general biomedical text. The strategies we compare include Naïve Bayes classifiers and several types of N-gram models. We introduce a new way of interpolating N-grams that does not require tuning any parameters. We also find the task to be similar to Language Identification.
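To make the analogy with Language Identification concrete, the following is a minimal sketch of one way a substring-based classifier of this kind could look: two character trigram models, one trained on chemical names and one on general text, compared by log-likelihood. Everything in it is an assumption made for illustration. The class and function names are hypothetical, the training lists are toy data, and add-one smoothing stands in for the parameter-free interpolation mentioned above, which the abstract does not specify.

```python
import math
from collections import Counter


class CharNgramModel:
    """Character n-gram model with add-one smoothing (illustrative assumption,
    not the interpolation scheme referred to in the abstract)."""

    def __init__(self, n=3):
        self.n = n
        self.ngrams = Counter()    # counts of n-character substrings
        self.contexts = Counter()  # counts of their (n-1)-character prefixes
        self.alphabet = set()      # characters seen in training, incl. padding

    def _pad(self, text):
        # "^" marks the start of a token and "$" marks its end.
        return "^" * (self.n - 1) + text.lower() + "$"

    def train(self, texts):
        for text in texts:
            padded = self._pad(text)
            self.alphabet.update(padded)
            for i in range(len(padded) - self.n + 1):
                gram = padded[i:i + self.n]
                self.ngrams[gram] += 1
                self.contexts[gram[:-1]] += 1

    def log_prob(self, text, vocab_size):
        # Sum of smoothed log P(char | previous n-1 chars) over the token.
        padded = self._pad(text)
        total = 0.0
        for i in range(len(padded) - self.n + 1):
            gram = padded[i:i + self.n]
            numerator = self.ngrams[gram] + 1
            denominator = self.contexts[gram[:-1]] + vocab_size
            total += math.log(numerator / denominator)
        return total


def chemical_score(token, chem_model, text_model):
    """Log-likelihood ratio: positive values favour the chemical-name model."""
    # Shared vocabulary size, plus one slot for unseen characters.
    vocab_size = len(chem_model.alphabet | text_model.alphabet) + 1
    return (chem_model.log_prob(token, vocab_size)
            - text_model.log_prob(token, vocab_size))


# Toy training data (hypothetical; a real system would use a chemical
# dictionary and a large biomedical corpus).
chem_model = CharNgramModel(n=3)
chem_model.train(["ethanol", "methanol", "propanol", "butanol",
                  "acetone", "benzene", "toluene"])

text_model = CharNgramModel(n=3)
text_model.train(["patient", "treatment", "study", "protein",
                  "results", "binding", "expression"])

for token in ["pentanol", "patients"]:
    print(token, round(chemical_score(token, chem_model, text_model), 2))
```

With this toy data, a token is labelled as a chemical when its score is positive. The same two-model comparison is how character n-gram language identifiers typically choose between languages, which is the similarity the abstract points to.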