Dictionary-based protein name recognition is the first step for practical information extraction from biomedical documents because, unlike machine-learning-based approaches, it provides ID information for the recognized terms. However, dictionary-based approaches have two serious problems: (1) a large number of false recognitions, mainly caused by short names, and (2) low recall due to spelling variation. In this paper, we tackle the former problem by using a machine learning method to filter out false positives. We also present an approximate string searching method to alleviate the latter problem. Experimental results using the GENIA corpus show that filtering with a naive Bayes classifier greatly improves precision with only a slight loss of recall, resulting in a much better F-score.
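The two-stage idea in the abstract (approximate dictionary lookup to recover spelling variants, then a naive Bayes filter to discard false positives) can be illustrated with a minimal Python sketch. It is not the paper's implementation. The per-token edit-distance matching, the `max_dist` threshold, the context-word features, and every name below (`edit_distance`, `find_candidates`, `NaiveBayesFilter`) are assumptions made for illustration only.

```python
import math
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance between strings a and b (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def find_candidates(tokens, dictionary, max_dist=1):
    """Return (index, token, entry_id) for tokens within max_dist edits of a name.

    `dictionary` maps a protein name to its ID. Matching one token at a time
    and the max_dist threshold are simplifying assumptions.
    """
    hits = []
    for i, tok in enumerate(tokens):
        for name, entry_id in dictionary.items():
            if edit_distance(tok.lower(), name.lower()) <= max_dist:
                hits.append((i, tok, entry_id))
                break  # keep the first matching entry for this token
    return hits


class NaiveBayesFilter:
    """Multinomial naive Bayes with add-one smoothing over bag-of-words features.

    Assumes the training data contains both True and False examples.
    """

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.class_counts = Counter()
        self.vocab = set()

    def train(self, examples):
        """examples: iterable of (feature_list, is_true_protein_name) pairs."""
        for features, label in examples:
            self.class_counts[label] += 1
            self.word_counts[label].update(features)
            self.vocab.update(features)

    def log_score(self, features, label):
        """Log prior plus smoothed log likelihood of the features under label."""
        total = sum(self.word_counts[label].values())
        score = math.log(self.class_counts[label] / sum(self.class_counts.values()))
        for f in features:
            score += math.log((self.word_counts[label][f] + 1)
                              / (total + len(self.vocab)))
        return score

    def accept(self, features):
        """Keep a candidate only if the 'true name' class scores higher."""
        return self.log_score(features, True) > self.log_score(features, False)
```

In this sketch, a candidate returned by `find_candidates` would be kept only if `accept` returns `True` for the features extracted around it. The looser matching raises recall, and the classifier is what restores precision for short, ambiguous names.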