Towards exhaustive protein modification event extraction

  • Authors:
  • Sampo Pyysalo; Tomoko Ohta; Makoto Miwa; Jun'ichi Tsujii

  • Affiliations:
  • University of Tokyo, Tokyo, Japan; University of Tokyo, Tokyo, Japan; University of Tokyo, Tokyo, Japan; Microsoft Research Asia, Beijing, China

  • Venue:
  • BioNLP '11 Proceedings of BioNLP 2011 Workshop
  • Year:
  • 2011

Abstract

Protein modifications, in particular post-translational modifications, have a central role in bringing about the full repertoire of protein functions, and the identification of specific protein modifications is important for understanding biological systems. This task presents a number of opportunities for the automatic support of manual curation efforts. However, the sheer number of different types of protein modifications is a daunting challenge for automatic extraction that has so far not been met in full, with most studies focusing on single modifications or a few prominent ones. In this work, we aim to meet this challenge: we analyse protein modification types through ontologies, databases, and literature, and introduce a corpus of 360 abstracts manually annotated in the BioNLP Shared Task event representation for over 4500 mentions of proteins and 1000 statements of modification events of nearly 40 different types. We argue that together with existing resources, this corpus provides sufficient coverage of modification types to make effectively exhaustive extraction of protein modifications from text feasible.