Recognizing names in biomedical texts using hidden Markov model and SVM plus sigmoid

Authors:
Zhou GuoDong
Affiliations:
Institute for Infocomm Research, Singapore
Venue:
JNLPBA '04 Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications
Year:
2004

Citing 8
Cited 1

The nature of statistical learning theory

The nature of statistical learning theory
An empirical study of smoothing techniques for language modeling

ACL '96 Proceedings of the 34th annual meeting on Association for Computational Linguistics
Named entity recognition using an HMM-based chunk tagger

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Tuning support vector machines for biomedical named entity recognition

BioMed '02 Proceedings of the ACL-02 workshop on Natural language processing in the biomedical domain - Volume 3
Two-phase biomedical NE recognition based on SVMs

BioMed '03 Proceedings of the ACL 2003 workshop on Natural language processing in biomedicine - Volume 13
Boosting precision and recall of dictionary-based protein name recognition

BioMed '03 Proceedings of the ACL 2003 workshop on Natural language processing in biomedicine - Volume 13
Effective adaptation of a Hidden Markov Model-based named entity recognizer for biomedical domain

BioMed '03 Proceedings of the ACL 2003 workshop on Natural language processing in biomedicine - Volume 13
The GENIA corpus: an annotated research abstract corpus in molecular biology domain

HLT '02 Proceedings of the second international conference on Human Language Technology Research

Scalable biomedical Named Entity Recognition: investigation of a database-supported SVM approach

International Journal of Bioinformatics Research and Applications

Quantified Score

Hi-index	0.00

Visualization

Abstract

In this paper, we present a named entity recognition system in the biomedical domain, called PowerBioNE. In order to deal with the special phenomena in the biomedical domain, various evidential features are proposed and integrated through a Hidden Markov Model (HMM). In addition, a Support Vector Machine (SVM) plus sigmoid is proposed to resolve the data sparseness problem in our system. Finally, we present two post-processing modules to deal with the cascaded entity name and abbreviation phenomena. Evaluation shows that our system achieves the F-measure of 69.1 and 71.2 on the 23 classes of GENIA V1.1 and V3.0 respectively. In particular, our system achieves the F-measure of 77.8 on the "protein" class of GENIA V3.0. It shows that our system outperforms the best published system on GENIA V1.1 and V3.0.