We propose a cascaded approach for extracting biomedical named entities from text documents using a unified model. Previous work often ignores the high computational cost of a single-phase approach. We alleviate this problem by dividing named entity extraction into a segmentation task and a classification task, which reduces the computational cost by an order of magnitude. A unified model, which we term "maximum-entropy margin-based" (MEMB), is used in both tasks. During training, the MEMB model takes into account the error between a correct and an incorrect output, which helps improve extraction of the sparse entity types that occur in biomedical literature. Experimental evaluations on the GENIA corpus from the BioNLP/NLPBA (2004) shared task demonstrate that the proposed approach achieves state-of-the-art performance.
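The segmentation/classification split can be illustrated with a minimal Python sketch. It is not the MEMB model: the segmenter and classifier are passed in as plain callables, and the toy versions in the usage note are hypothetical stand-ins. The sketch assumes BIO-style tagging, which the abstract does not specify, and the names `label_counts`, `spans_from_bio`, and `two_phase_ner` are invented for illustration. Under that assumption, the five entity types of the shared task give 11 labels in a single-phase tagger but only 3 in the segmentation phase, so a first-order sequence model scores 121 label transitions per token instead of 9. This is one plausible reading of where the savings come from, not a claim made by the authors.

```python
from typing import Callable, List, Tuple

# The five entity types annotated in the BioNLP/NLPBA 2004 shared task data.
ENTITY_TYPES = ["protein", "DNA", "RNA", "cell_line", "cell_type"]


def label_counts(num_types: int) -> Tuple[int, int]:
    """Return (single-phase label count, segmentation label count).

    Assumes BIO tagging: a single-phase tagger needs B-x and I-x for
    every type plus O, while segmentation only needs B, I, and O.
    """
    return 2 * num_types + 1, 3


def spans_from_bio(tags: List[str]) -> List[Tuple[int, int]]:
    """Convert untyped B/I/O tags into half-open (start, end) token spans."""
    spans: List[Tuple[int, int]] = []
    start = None
    for i, tag in enumerate(tags):
        # A "B" always opens a span; a stray "I" with no open span does too.
        if tag == "B" or (tag == "I" and start is None):
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def two_phase_ner(
    tokens: List[str],
    segment: Callable[[List[str]], List[str]],
    classify: Callable[[List[str], int, int], str],
) -> List[Tuple[int, int, str]]:
    """Phase 1 finds entity boundaries; phase 2 assigns a type to each span."""
    tags = segment(tokens)
    return [(s, e, classify(tokens, s, e)) for s, e in spans_from_bio(tags)]
```

A usage example with hypothetical toy components standing in for trained models:

```python
tokens = ["IL-2", "gene", "expression", "in", "T", "cells"]


def toy_segment(toks):
    # Hypothetical fixed output in place of a trained segmenter.
    return ["B", "I", "O", "O", "B", "I"]


def toy_classify(toks, start, end):
    # Hypothetical rule in place of a trained classifier.
    return "DNA" if "gene" in toks[start:end] else "cell_type"


entities = two_phase_ner(tokens, toy_segment, toy_classify)
```

Because the classifier sees a whole candidate span at once, it can use span-level features (for example, the last token of the span) that a token-level tagger cannot use directly.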