Tuning support vector machines for biomedical named entity recognition

Authors:
Jun'ichi Kazama;Takaki Makino;Yoshihiro Ohta;Jun'ichi Tsujii
Affiliations:
University of Tokyo, Bunkyo-ku, Tokyo, Japan;University of Tokyo, Bunkyo-ku, Tokyo, Japan;Hitachi, Ltd., Kokubunji, Tokyo, Japan;University of Tokyo, Bunkyo-ku, Tokyo, Japan
Venue:
BioMed '02 Proceedings of the ACL-02 workshop on Natural language processing in the biomedical domain - Volume 3
Year:
2002

Abstract

We explore the use of Support Vector Machines (SVMs) for biomedical named entity recognition. To make SVM training tractable on the largest available corpus, the GENIA corpus, we propose splitting the non-entity class into sub-classes using part-of-speech information. In addition, we explore new features such as a word cache and the states of an HMM trained by unsupervised learning. Experiments on the GENIA corpus show that our class splitting technique not only makes training on the GENIA corpus feasible but also improves accuracy. The proposed new features also contribute to improving accuracy. We compare our SVM-based recognition system with a system that uses a Maximum Entropy tagging method.
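
The class splitting idea can be illustrated with a minimal sketch. It is not the authors' implementation: the BIO-style labels, the `O` tag for non-entity tokens, the Penn Treebank style part-of-speech tags, and the function names `split_non_entity` and `merge_non_entity` are all assumptions made for illustration. Before training, each non-entity label is replaced by a sub-class named after the token's part of speech; after prediction, every sub-class is mapped back to the single non-entity label.

```python
from typing import List, Tuple

# (word, part-of-speech tag, label); the label scheme is an assumption.
Token = Tuple[str, str, str]


def split_non_entity(tokens: List[Token], other: str = "O") -> List[Token]:
    """Relabel each non-entity token as '<other>-<POS>' before training."""
    return [
        (word, pos, f"{other}-{pos}" if label == other else label)
        for word, pos, label in tokens
    ]


def merge_non_entity(labels: List[str], other: str = "O") -> List[str]:
    """Collapse predicted '<other>-<POS>' sub-classes back to '<other>'."""
    prefix = other + "-"
    return [other if label.startswith(prefix) else label for label in labels]


# Hypothetical example sentence with hand-assigned tags.
sentence = [
    ("The", "DT", "O"),
    ("IL-2", "NN", "B-protein"),
    ("receptor", "NN", "I-protein"),
    ("is", "VBZ", "O"),
    ("expressed", "VBN", "O"),
]

split = split_non_entity(sentence)
split_labels = [label for _, _, label in split]
merged_labels = merge_non_entity(split_labels)
```

Here `split_labels` is what a multi-class classifier would be trained to predict, and `merged_labels` is what would be scored against the original annotation. Entity labels pass through both functions unchanged, so the split only affects how the large non-entity class is partitioned.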