CONE: metrics for automatic evaluation of named entity co-reference resolution

  • Authors: Bo Lin, Rushin Shah, Robert Frederking, Anatole Gershman

  • Affiliations: Carnegie Mellon University, PA (all authors)

  • Venue: NEWS '10 Proceedings of the 2010 Named Entities Workshop
  • Year: 2010

Abstract

Human annotation for Co-reference Resolution (CRR) is labor intensive and costly, and only a handful of annotated corpora are currently available. However, corpora with Named Entity (NE) annotations are widely available. Also, unlike current CRR systems, state-of-the-art NER systems have very high accuracy and can generate NE labels that are very close to the gold standard for unlabeled corpora. We propose a new set of metrics collectively called CONE for Named Entity Co-reference Resolution (NE-CRR) that use a subset of gold standard annotations, with the advantage that this subset can be easily approximated using NE labels when gold standard CRR annotations are absent. We define CONE B³ and CONE CEAF metrics based on the traditional B³ and CEAF metrics and show that CONE B³ and CONE CEAF scores of any CRR system on any dataset are highly correlated with its B³ and CEAF scores respectively. We obtain correlation factors greater than 0.6 for all CRR systems across all datasets, and a best-case correlation factor of 0.8. We also present a baseline method to estimate the gold standard required by CONE metrics, and show that CONE B³ and CONE CEAF scores using this estimated gold standard are also correlated with B³ and CEAF scores respectively. We thus demonstrate the suitability of CONE B³ and CONE CEAF for automatic evaluation of NE-CRR.
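
To make the idea of scoring on a subset of the annotations concrete, the following is a minimal sketch of the standard B-cubed (B3) precision, recall and F1 computation, with an optional restriction to a chosen set of mentions. The restriction step is only an illustrative assumption about what "using a subset of gold standard annotations" could look like; the abstract does not give the exact CONE B3 definition, and this code should not be read as the authors' method. The function name `b_cubed`, the `keep` parameter, and the toy mention identifiers are all hypothetical. The sketch also assumes that the gold and system clusterings cover the same set of mentions.

```python
from typing import Dict, FrozenSet, Hashable, List, Optional, Set, Tuple

Mention = Hashable
Clustering = List[Set[Mention]]


def _index(clusters: Clustering) -> Dict[Mention, FrozenSet[Mention]]:
    """Map each mention to the (frozen) cluster that contains it."""
    index: Dict[Mention, FrozenSet[Mention]] = {}
    for cluster in clusters:
        frozen = frozenset(cluster)
        for mention in frozen:
            index[mention] = frozen
    return index


def b_cubed(
    gold: Clustering,
    system: Clustering,
    keep: Optional[Set[Mention]] = None,
) -> Tuple[float, float, float]:
    """Return B-cubed (precision, recall, F1).

    If `keep` is given, both clusterings are first restricted to those
    mentions. This restriction is an illustrative assumption, not the
    definition of CONE B3 from the paper.
    """
    if keep is not None:
        gold = [c & keep for c in gold if c & keep]
        system = [c & keep for c in system if c & keep]

    gold_of = _index(gold)
    sys_of = _index(system)
    # Assumption: both clusterings cover the same mentions; score the overlap.
    mentions = gold_of.keys() & sys_of.keys()
    if not mentions:
        return 0.0, 0.0, 0.0

    # Per-mention precision/recall, averaged over all scored mentions.
    precision = sum(
        len(gold_of[m] & sys_of[m]) / len(sys_of[m]) for m in mentions
    ) / len(mentions)
    recall = sum(
        len(gold_of[m] & sys_of[m]) / len(gold_of[m]) for m in mentions
    ) / len(mentions)
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy data (hypothetical): five mentions, two gold entities.
gold_clusters = [{"a", "b", "c"}, {"d", "e"}]
system_clusters = [{"a", "b"}, {"c", "d", "e"}]
# Hypothetical set of mentions labeled as named entities.
named_entity_mentions = {"a", "b", "d"}

full_scores = b_cubed(gold_clusters, system_clusters)
subset_scores = b_cubed(gold_clusters, system_clusters, keep=named_entity_mentions)
```

In this toy case the system's only error involves mention `c`, which is outside the kept subset, so the restricted score is perfect while the full score is not. Measuring how closely the two kinds of score track each other across systems and datasets is the sort of comparison the abstract reports as a correlation.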