Protein name tagging for biomedical annotation in text

Authors:
Kaoru Yamamoto;Taku Kudo;Akihiko Konagaya;Yuji Matsumoto
Affiliations:
The Institute of Physical and Chemical Research, Suehiro-cho, Tsurumi-ku, Yokohama, Japan;Nara Institute of Science and Technology, Ikoma, Nara, Japan;The Institute of Physical and Chemical Research, Suehiro-cho, Tsurumi-ku, Yokohama, Japan;Nara Institute of Science and Technology, Ikoma, Nara, Japan
Venue:
BioMed '03 Proceedings of the ACL 2003 workshop on Natural language processing in biomedicine - Volume 13
Year:
2003

Abstract

We explore the use of morphological analysis as preprocessing for protein name tagging. Our method finds protein names by chunking over morphemes, the smallest units determined by the morphological analysis, which helps to recognize the exact boundaries of protein names. Moreover, our morphological analyzer can handle compounds, which offers a simple way to adapt name descriptions from biomedical resources for language processing. On the GENIA corpus 3.01, our method attains an F-score of 70 points for protein molecule names and 75 points for protein names including molecules, families, and domains.
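
To make the idea of morpheme-level chunking and exact-boundary scoring concrete, here is a minimal sketch in Python. It is not the authors' implementation: the B/I/O tag set, the morpheme segmentation of the example sentence, and the function names bio_to_spans and f_score are all assumptions made for illustration. The sketch only shows how chunks can be read off a tag sequence over morphemes and how an exact-match F-score could be computed from them.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start, end) morpheme indices, end exclusive


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect chunks from a B/I/O tag sequence (assumed tag set)."""
    spans: Set[Span] = set()
    start = None
    for i, tag in enumerate(tags):
        # "B", "O", or a stray "I" with no open chunk ends any open chunk.
        if tag in ("B", "O") or (tag == "I" and start is None):
            if start is not None:
                spans.add((start, i))
                start = None
            if tag != "O":
                start = i  # a stray "I" is leniently treated as a chunk start
    if start is not None:
        spans.add((start, len(tags)))
    return spans


def f_score(gold: List[str], pred: List[str]) -> float:
    """Exact-match F-score: a chunk counts only if both boundaries agree."""
    gold_spans = bio_to_spans(gold)
    pred_spans = bio_to_spans(pred)
    true_pos = len(gold_spans & pred_spans)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred_spans)
    recall = true_pos / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


# Hypothetical morpheme segmentation; a real analyzer may split differently.
morphemes = ["IL", "-", "2", "receptor", "binds", "NF", "-", "kappa", "B"]
gold = ["B", "I", "I", "I", "O", "B", "I", "I", "I"]
pred = ["B", "I", "I", "O", "O", "B", "I", "I", "I"]  # misses "receptor"

print(sorted(bio_to_spans(gold)))
print(f_score(gold, pred))
```

In this toy example the predicted first chunk stops one morpheme early, so it earns no credit under exact matching even though it overlaps the gold chunk. That is the sense in which boundary recognition matters for the reported scores.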