Two-phase biomedical NE recognition based on SVMs

Authors:
Ki-Joong Lee;Young-Sook Hwang;Hae-Chang Rim
Affiliations:
Korea University, Anam-dong, SEOUL, Korea;Korea University, Anam-dong, SEOUL, Korea;Korea University, Anam-dong, SEOUL, Korea
Venue:
BioMed '03 Proceedings of the ACL 2003 workshop on Natural language processing in biomedicine - Volume 13
Year:
2003

Abstract

When SVMs are used for named entity recognition, we are often confronted with the multi-class problem. The larger the number of classes, the more severe the problem becomes. In particular, the one-vs-rest method is apt to degrade performance by generating a severely unbalanced class distribution. To tackle this problem, we adopt a two-phase named entity recognition method based on SVMs and a dictionary. In the first phase, we identify each entity with an SVM classifier and post-process the identified entities by a simple dictionary look-up. In the second phase, we classify the semantic class of each identified entity with SVMs. By dividing the task into two subtasks, i.e. entity identification and semantic classification, the unbalanced class distribution problem can be alleviated. Furthermore, we can select the features relevant to each subtask and adopt a different classification method for each. Experimental results on the GENIA corpus show that the proposed method is effective not only in reducing training cost but also in improving performance: the identification performance is about 79.9 (Fβ=1), and the semantic classification accuracy is about 66.5 (Fβ=1).
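
The class-imbalance argument can be made concrete with a little arithmetic. The sketch below is not the authors' code. It only compares the fraction of positive training examples a binary classifier would see under a flat one-vs-rest setup with the fractions seen in a two-phase setup. The label names and counts in TOY_COUNTS are invented for illustration and are not GENIA statistics, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical token-level label counts; "O" marks tokens outside any entity.
# These numbers are invented for illustration and are NOT taken from GENIA.
TOY_COUNTS = Counter(
    {"O": 8000, "protein": 1000, "DNA": 500, "cell_type": 400, "RNA": 100}
)


def one_vs_rest_positive_ratios(counts):
    """Fraction of positive examples seen by each flat one-vs-rest classifier.

    Each entity class is trained against all other tokens, including "O".
    """
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items() if label != "O"}


def two_phase_positive_ratios(counts):
    """Positive fractions in a two-phase setup.

    Phase 1: entity vs. non-entity over all tokens.
    Phase 2: each semantic class against the other entity tokens only.
    """
    total = sum(counts.values())
    entity_total = total - counts["O"]
    phase1 = entity_total / total
    phase2 = {
        label: n / entity_total for label, n in counts.items() if label != "O"
    }
    return phase1, phase2


if __name__ == "__main__":
    ovr = one_vs_rest_positive_ratios(TOY_COUNTS)
    phase1, phase2 = two_phase_positive_ratios(TOY_COUNTS)
    print(f"phase 1 entity ratio: {phase1:.3f}")
    for label in ovr:
        print(f"{label:10s} one-vs-rest {ovr[label]:.3f}  phase 2 {phase2[label]:.3f}")
```

With these toy counts, the rarest class makes up 1% of the examples in the flat setup but 5% of the examples in phase 2, because non-entity tokens are filtered out before semantic classification. The phase 2 ratios assume a one-vs-rest scheme restricted to entity tokens; the abstract leaves open that a different classification method may be used in that phase.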