Combining sample selection and error-driven pruning for machine learning of coreference rules

  • Authors: Vincent Ng; Claire Cardie

  • Affiliations: Cornell University, Ithaca, NY; Cornell University, Ithaca, NY

  • Venue: EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
  • Year: 2002

Abstract

Most machine learning solutions to noun phrase coreference resolution recast the problem as a classification task. We examine three potential problems with this reformulation, namely, skewed class distributions, the inclusion of "hard" training instances, and the loss of transitivity inherent in the original coreference relation. We show how these problems can be handled via intelligent sample selection and error-driven pruning of classification rule-sets. The resulting system achieves F-measures of 69.5 and 63.4 on the MUC-6 and MUC-7 coreference resolution data sets, respectively, surpassing the performance of the best MUC-6 and MUC-7 coreference systems. In particular, the system outperforms the best-performing learning-based coreference system to date.
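
To make two of the problems named in the abstract concrete, the following is a minimal illustrative sketch, not the authors' system. It assumes a toy document of six noun phrases with hand-assigned coreference chains; the mention strings and the function names `pairwise_instances` and `clusters_from_links` are hypothetical. The first function builds one classification instance per noun phrase pair, which shows how negatives come to outnumber positives. The second merges pairwise positive decisions with union-find, which shows that transitivity has to be restored after classification because the classifier judges each pair in isolation.

```python
from itertools import combinations


def pairwise_instances(mentions, chain_of):
    """Build one (np_i, np_j, label) instance per mention pair.

    label is True when both mentions belong to the same gold chain.
    """
    return [(a, b, chain_of[a] == chain_of[b]) for a, b in combinations(mentions, 2)]


def clusters_from_links(mentions, links):
    """Merge pairwise 'coreferent' decisions into chains (transitive closure)."""
    parent = {m: m for m in mentions}

    def find(m):
        # Follow parent pointers to the representative, halving the path as we go.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in links:
        parent[find(a)] = find(b)

    groups = {}
    for m in mentions:
        groups.setdefault(find(m), []).append(m)
    return sorted(groups.values())


# Toy data (assumed for illustration only).
mentions = ["Clinton", "the president", "he", "Congress", "the bill", "it"]
chain_of = {"Clinton": 0, "the president": 0, "he": 0,
            "Congress": 1, "the bill": 2, "it": 2}

instances = pairwise_instances(mentions, chain_of)
positives = sum(label for _, _, label in instances)
negatives = len(instances) - positives
print(f"{positives} positive vs {negatives} negative instances")

# Only two links were predicted; the pair ("Clinton", "he") was never linked
# directly, yet the closure places all three mentions in one chain.
links = [("Clinton", "the president"), ("the president", "he")]
chains = clusters_from_links(mentions, links)
print(chains)
```

Even in this six-mention example, most pairs are negative, and the imbalance grows with document length because the number of pairs grows quadratically while chains stay short. The closure step also shows why a single wrong positive link can merge two otherwise separate chains.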