Support vector machine active learning with applications to text classification
The Journal of Machine Learning Research
Efficient support vector classifiers for named entity recognition
COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
HLT '91 Proceedings of the workshop on Speech and Natural Language
Tuning support vector machines for biomedical named entity recognition
BioMed '02 Proceedings of the ACL-02 workshop on Natural language processing in the biomedical domain - Volume 3
LIBLINEAR: A Library for Large Linear Classification
The Journal of Machine Learning Research
Web-scale named entity recognition
Proceedings of the 17th ACM conference on Information and knowledge management
Biomedical named entity recognition using conditional random fields and rich feature sets
JNLPBA '04 Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications
Bootstrapping and evaluating named entity recognition in the biomedical domain
BioNLP '06 Proceedings of the Workshop on Linking Natural Language Processing and Biology: Towards Deeper Biological Literature Analysis
Incorporating GENETAG-style annotation to GENIA corpus
BioNLP '09 Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing
Sparse Online Learning via Truncated Gradient
The Journal of Machine Learning Research
Design challenges and misconceptions in named entity recognition
CoNLL '09 Proceedings of the Thirteenth Conference on Computational Natural Language Learning
Inducing domain-specific semantic class taggers from (almost) nothing
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Adapting self-training for semantic role labeling
ACLstudent '10 Proceedings of the ACL 2010 Student Research Workshop
Developing a robust part-of-speech tagger for biomedical text
PCI'05 Proceedings of the 10th Panhellenic conference on Advances in Informatics
Hi-index | 0.00 |
Named Entity Recognition (NER) is an important first step for BioNLP tasks, e.g., gene normalization and event extraction. Employing supervised machine learning techniques for achieving high performance recent NER systems require a manually annotated corpus in which every mention of the desired semantic types in a text is annotated. However, great amounts of human effort is necessary to build and maintain an annotated corpus. This study explores a method to build a high-performance NER without a manually annotated corpus, but using a comprehensible lexical database that stores numerous expressions of semantic types and with huge amount of unannotated texts. We underscore the effectiveness of our approach by comparing the performance of NERs trained on an automatically acquired training data and on a manually annotated corpus.