Lexical patterns, features and knowledge resources for coreference resolution in clinical notes

  • Authors:
  • Phil Gooch;Abdul Roudsari

  • Affiliations:
  • Centre for Health Informatics, City University, London, UK;Centre for Health Informatics, City University, London, UK and School of Health Information Science, University of Victoria, BC, Canada

  • Venue:
  • Journal of Biomedical Informatics
  • Year:
  • 2012

Abstract

Generation of entity coreference chains provides a means to extract linked narrative events from clinical notes, but despite being a well-researched topic in natural language processing, general-purpose coreference tools perform poorly on clinical texts. This paper presents a knowledge-centric and pattern-based approach to resolving coreference across a wide variety of clinical records from two corpora (Ontology Development and Information Extraction (ODIE) and i2b2/VA), and describes a method for generating coreference chains using progressively pruned linked lists that reduces the search space and facilitates evaluation by a number of metrics. Independent evaluation results give an F-measure for each corpus of 79.2% and 87.5%, respectively. A baseline of blind coreference of mentions of the same class gives F-measures of 65.3% and 51.9%, respectively. For the ODIE corpus, recall is significantly improved over the baseline (p < 0.05). For the i2b2/VA corpus, recall, precision, and F-measure are significantly improved over the baseline (p
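
To make the baseline mentioned in the abstract concrete, the following is a minimal sketch of blind same-class coreference: every mention of a given class is placed in a single chain, with no lexical or contextual checks. The function name `blind_baseline_chains`, the `(text, class, offset)` tuple layout, the example class labels, and the choice to drop singleton chains are all assumptions made for illustration; the abstract does not specify how the authors implemented their baseline.

```python
from collections import defaultdict


def blind_baseline_chains(mentions):
    """Link all mentions of the same class into one chain (hypothetical helper).

    `mentions` is an iterable of (text, mention_class, char_offset) tuples.
    This tuple layout is an assumption for the sketch, not the paper's format.
    """
    by_class = defaultdict(list)
    # Process mentions in document order so each chain reads left to right.
    for text, mention_class, _offset in sorted(mentions, key=lambda m: m[2]):
        by_class[mention_class].append(text)
    # Assumption: a chain needs at least two mentions, so singletons are dropped.
    return {cls: chain for cls, chain in by_class.items() if len(chain) > 1}


# Illustrative mentions; the class labels are examples only.
mentions = [
    ("chest pain", "problem", 10),
    ("aspirin", "treatment", 42),
    ("the pain", "problem", 80),
    ("an ECG", "test", 120),
    ("the medication", "treatment", 150),
]
chains = blind_baseline_chains(mentions)
```

Because this baseline links mentions on class alone, it will merge unrelated entities that happen to share a class, which is consistent with the lower baseline F-measures reported above.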