Annotation of chemical named entities

  • Authors:
  • Peter Corbett; Colin Batchelor; Simone Teufel

  • Affiliations:
  • Cambridge University, Cambridge, UK; Thomas Graham House, Cambridge, UK; University of Cambridge, UK

  • Venue:
  • BioNLP '07 Proceedings of the Workshop on BioNLP 2007: Biological, Translational, and Clinical Language Processing
  • Year:
  • 2007

Abstract

We describe the annotation of chemical named entities in scientific text. A set of annotation guidelines defines five types of named entity and provides instructions for resolving special cases. A corpus of full-text chemistry papers was annotated, with an inter-annotator agreement F score of 93%. An investigation of named entity recognition using LingPipe suggests that F scores of 63% are possible without customisation, and that scores of 74% are possible with the addition of custom tokenisation and the use of dictionaries.
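
The abstract reports agreement and recognition quality as F scores without showing the calculation. The sketch below is one common way to compute a balanced F score between two sets of entity annotations, assuming exact matching on character span and entity type. The matching criterion, the `f_score` helper, the offsets, and the placeholder type labels `CHEMICAL` and `REACTION` are all assumptions made for illustration; they are not taken from the paper's guidelines or data.

```python
from typing import Set, Tuple

# An annotation is (start offset, end offset, entity type).
# This representation is an assumption made for this sketch.
Entity = Tuple[int, int, str]


def f_score(reference: Set[Entity], candidate: Set[Entity]) -> float:
    """Balanced F score (F1) of `candidate` against `reference`.

    Two annotations count as a match only if span and type are identical
    (exact matching is an assumption of this sketch).
    """
    if not reference or not candidate:
        return 0.0
    true_positives = len(reference & candidate)
    precision = true_positives / len(candidate)
    recall = true_positives / len(reference)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations from two annotators over the same text.
annotator_a = {
    (0, 7, "CHEMICAL"),
    (12, 21, "CHEMICAL"),
    (30, 39, "REACTION"),
}
annotator_b = {
    (0, 7, "CHEMICAL"),
    (12, 21, "CHEMICAL"),
    (30, 38, "REACTION"),  # boundary disagreement: not an exact match
    (45, 50, "CHEMICAL"),  # extra annotation made only by annotator B
}

agreement = f_score(annotator_a, annotator_b)
print(f"Inter-annotator F score: {agreement:.3f}")
```

With exact matching, swapping the two annotators exchanges precision and recall and leaves the F score unchanged, so the measure can be used for pairwise agreement without designating either annotator as the gold standard. The same function could score a recogniser's output against a gold-standard annotation.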