Named entity recognition in biomedical texts using an HMM model

Authors:
Shaojun Zhao
Affiliations:
University of Alberta, Edmonton, Canada
Venue:
JNLPBA '04 Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications
Year:
2004

Citing 13

Explorations in Automatic Thesaurus Discovery

Automatic retrieval and clustering of similar words
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 2

Distributional clustering of English words
ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics

Noun classification from predicate-argument structures
ACL '90 Proceedings of the 28th annual meeting on Association for Computational Linguistics

Measures of distributional similarity
ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics

Frequency estimates for statistical word similarity measures
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1

Recognizing names in biomedical texts: a machine learning approach
Bioinformatics

Tuning support vector machines for biomedical named entity recognition
BioMed '02 Proceedings of the ACL-02 workshop on Natural language processing in the biomedical domain - Volume 3

Use of support vector machines in extended named entity recognition
COLING-02 proceedings of the 6th conference on Natural language learning - Volume 20

Two-phase biomedical NE recognition based on SVMs
BioMed '03 Proceedings of the ACL 2003 workshop on Natural language processing in biomedicine - Volume 13

The GENIA corpus: an annotated research abstract corpus in molecular biology domain
HLT '02 Proceedings of the second international conference on Human Language Technology Research

Comparison between tagged corpora for the named entity task
CompareCorpora '00 Proceedings of the Workshop on Comparing Corpora

A nearest-neighbor method for resolving PP-Attachment ambiguity
IJCNLP'04 Proceedings of the First international joint conference on Natural Language Processing

Cited 9

Boosting performance of bio-entity recognition by combining results from multiple systems
Proceedings of the 5th international workshop on Bioinformatics

Introduction to the bio-entity recognition task at JNLPBA
JNLPBA '04 Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications

Using conditional random fields for result identification in biomedical abstracts
Integrated Computer-Aided Engineering

Quantifying the impact of concept recognition on biomedical information retrieval
Information Processing and Management: an International Journal

A generic classifier-ensemble approach for biomedical named entity recognition
PAKDD'12 Proceedings of the 16th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part I

Technical term recognition with semi-supervised learning using hierarchical bayesian language models
NLDB'12 Proceedings of the 17th international conference on Applications of Natural Language Processing and Information Systems

Biomedical named entity recognition: a poor knowledge HMM-based approach
NLDB'07 Proceedings of the 12th international conference on Applications of Natural Language to Information Systems

Unsupervised biomedical named entity recognition: Experiments with clinical and biological texts
Journal of Biomedical Informatics

Towards a Protein-Protein Interaction information extraction system: Recognizing named entities
Knowledge-Based Systems

Abstract

Although a huge number of biomedical texts are available online, few tools are good enough to help people extract information or knowledge from them. Named entity recognition (NER) is therefore important for downstream processing such as information retrieval, information extraction, and knowledge discovery. We introduce a Hidden Markov Model (HMM) for NER with word similarity-based smoothing. Our experiment shows that this smoothing can improve performance by exploiting large amounts of unlabeled data. While many systems rely on laboriously hand-coded rules for all kinds of word features, we show that word similarity is a potential method for automatically obtaining word formation, prefix, suffix, and abbreviation information from biomedical texts, as well as useful word distribution information.
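
The abstract does not give the smoothing formula, so the following is only a minimal sketch of one way word similarity-based smoothing of HMM emission probabilities could look. It assumes a linear interpolation between the direct estimate of P(word | tag) and a similarity-weighted average over the word's neighbours. The function names, the weight `lam`, the tag labels, and the toy similarity scores are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def ml_emissions(tagged_tokens):
    """Maximum-likelihood estimates of P(word | tag) from (word, tag) pairs."""
    pair_counts = Counter(tagged_tokens)
    tag_counts = Counter(tag for _, tag in tagged_tokens)
    probs = defaultdict(dict)
    for (word, tag), n in pair_counts.items():
        probs[tag][word] = n / tag_counts[tag]
    return probs


def smoothed_emission(word, tag, probs, similar, lam=0.7):
    """Interpolate the direct estimate with a similarity-weighted
    average of the estimates for the word's neighbours.

    `similar` maps a word to {neighbour: similarity score}; in practice
    these scores would be computed from unlabeled text (assumption).
    """
    direct = probs.get(tag, {}).get(word, 0.0)
    neighbours = similar.get(word, {})
    total = sum(neighbours.values())
    if total == 0:
        # No neighbours known: fall back to the direct estimate alone.
        return direct
    backoff = sum(
        score * probs.get(tag, {}).get(neighbour, 0.0)
        for neighbour, score in neighbours.items()
    ) / total
    return lam * direct + (1 - lam) * backoff


# Toy labeled data; the tag names are illustrative only.
labeled = [
    ("IL-2", "B-protein"),
    ("gene", "O"),
    ("IL-2", "B-protein"),
    ("binds", "O"),
]
probs = ml_emissions(labeled)

# Hypothetical similarity scores for a word never seen in the labeled data.
similar = {"IL-4": {"IL-2": 0.8, "binds": 0.2}}

print(smoothed_emission("IL-4", "B-protein", probs, similar))
```

In this toy setup the unseen word "IL-4" receives a nonzero emission probability under the protein tag because its most similar neighbour, "IL-2", was observed with that tag. That is the kind of benefit the abstract attributes to unlabeled data, though the actual estimator used in the paper may differ.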