Two-phase biomedical NE recognition based on SVMs
BioMed '03 Proceedings of the ACL 2003 workshop on Natural language processing in biomedicine - Volume 13
In this paper, we propose a two-phase biomedical named entity (NE) recognition method based on SVMs. We first recognize biomedical terms and then assign appropriate semantic classes to the recognized terms. In this two-phase method, the performance of term recognition is critical to the overall performance of the system, because term recognition errors propagate to the semantic classification phase. In this study, we try to improve term recognition by using lexical knowledge. As lexical knowledge, we use salient noun phrases (NPs) and collocations extracted from a raw corpus. In addition, we use morphological knowledge extracted from the training data as features for learning the SVM classifiers. Experimental results show that our system obtains an F-measure of 62.97% on the test data, and that lexical knowledge improves performance by up to 2.82%.
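The two-phase flow described above can be illustrated with a minimal sketch. It is not the authors' implementation: the SVM classifiers are replaced by hypothetical stand-in functions (`toy_boundary_tagger`, `toy_semantic_classifier`), the B/I/O boundary encoding is an assumption, and the class labels and tiny lexicon are made up for illustration. The sketch only shows how phase 2 depends on the spans produced by phase 1.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # half-open token span [start, end)


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Convert B/I/O boundary tags into half-open token spans."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B" or (tag == "I" and start is None):
            if start is not None:
                spans.append((start, i))  # close the previous term
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def two_phase_ner(
    tokens: List[str],
    tag_boundaries: Callable[[List[str]], List[str]],
    classify_term: Callable[[List[str]], str],
) -> List[Tuple[str, str]]:
    """Phase 1 finds term boundaries; phase 2 labels each recognized term."""
    results = []
    for start, end in bio_to_spans(tag_boundaries(tokens)):
        term = tokens[start:end]
        results.append((" ".join(term), classify_term(term)))
    return results


def toy_boundary_tagger(tokens: List[str]) -> List[str]:
    """Hypothetical stand-in for the phase-1 SVM: a tiny lexicon lookup."""
    lexicon = {"IL-2", "gene", "T", "cells"}  # illustrative only
    tags: List[str] = []
    for tok in tokens:
        if tok in lexicon:
            tags.append("I" if tags and tags[-1] != "O" else "B")
        else:
            tags.append("O")
    return tags


def toy_semantic_classifier(term: List[str]) -> str:
    """Hypothetical stand-in for the phase-2 SVM: a head-word rule."""
    head = term[-1]
    if head == "gene":
        return "DNA"
    if head == "cells":
        return "cell_type"
    return "other"


tokens = ["The", "IL-2", "gene", "activates", "T", "cells"]
print(two_phase_ner(tokens, toy_boundary_tagger, toy_semantic_classifier))
```

If the phase-1 tagger misses a term (for example, by tagging every token `O`), `two_phase_ner` returns nothing for it, however good the phase-2 classifier is. This is the error propagation that makes term recognition the critical step.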