A corpus of clinical narratives annotated with temporal information

  • Authors: Lucian Galescu; Nate Blaylock

  • Affiliations: Florida Institute for Human and Machine Cognition, Pensacola, FL, USA (both authors)

  • Venue: Proceedings of the 2nd ACM SIGHIT International Health Informatics Symposium
  • Year: 2012

Abstract

Clinical reports often include descriptions of events in the patient's medical history, as well as explicit or implicit temporal information about these events. We are working towards applying deep Natural Language Processing tools to understand such narratives. This requires both extracting and classifying the relevant events and placing those events in time, or at least in relation to one another. Although several corpora of news data exist that have been annotated using the TimeML schema, similar corpora of clinical reports are not readily available. In this paper we report on the design of a small corpus and the annotation schema we developed, based on data from the fourth i2b2/VA challenge. These data include, among other things, annotations for medical problems, tests, and treatments in clinical reports from several healthcare institutions. We selected a subset of these clinical reports and added annotations for events, time expressions, and temporal relations, similar to those used in the TempEval tasks for the news domain. The annotations have been made freely available to the research community.