Bi-directional Joint Inference for Entity Resolution and Segmentation Using Imperatively-Defined Factor Graphs

  • Authors:
  • Sameer Singh;Karl Schultz;Andrew Mccallum

  • Affiliations:
  • Department of Computer Science, University of Massachusetts, Amherst, USA 01002;Department of Computer Science, University of Massachusetts, Amherst, USA 01002;Department of Computer Science, University of Massachusetts, Amherst, USA 01002

  • Venue:
  • ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part II
  • Year:
  • 2009

Quantified Score

Hi-index 0.01

Visualization

Abstract

There has been growing interest in using joint inference across multiple subtasks as a mechanism for avoiding the cascading accumulation of errors in traditional pipelines. Several recent papers demonstrate joint inference between the segmentation of entity mentions and their de-duplication, however, they have various weaknesses: inference information flows only in one direction, the number of uncertain hypotheses is severely limited, or the subtasks are only loosely coupled. This paper presents a highly-coupled, bi-directional approach to joint inference based on efficient Markov chain Monte Carlo sampling in a relational conditional random field. The model is specified with our new probabilistic programming language that leverages imperative constructs to define factor graph structure and operation. Experimental results show that our approach provides a dramatic reduction in error while also running faster than the previous state-of-the-art system.