Notions of correctness when evaluating protein name taggers

Authors:
Fredrik Olsson;Gunnar Eriksson;Kristofer Franzén;Lars Asker;Per Lidén
Affiliations:
Swedish Institute of Computer Science, Kista, Sweden;Swedish Institute of Computer Science, Kista, Sweden;Swedish Institute of Computer Science, Kista, Sweden;Virtual Genetics Laboratory AB, Stockholm, Sweden;Virtual Genetics Laboratory AB, Stockholm, Sweden
Venue:
COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
Year:
2002

Abstract

This paper introduces four notions of correctness for measuring the performance of protein name taggers, each of which reflects particular characteristics of the tagger under evaluation. The discussion of these notions centers on the evaluation of two protein name taggers: Yapex, developed by the authors, and KeX, developed by Fukuda et al. (1998). To illustrate how the evaluation methods differ, both taggers are applied to a test corpus of 101 MEDLINE abstracts in which all occurrences of protein names have been marked up by domain experts.