Bi-directional Joint Inference for Entity Resolution and Segmentation Using Imperatively-Defined Factor Graphs

Authors:
Sameer Singh;Karl Schultz;Andrew McCallum
Affiliations:
Department of Computer Science, University of Massachusetts, Amherst, USA 01002;Department of Computer Science, University of Massachusetts, Amherst, USA 01002;Department of Computer Science, University of Massachusetts, Amherst, USA 01002
Venue:
ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part II
Year:
2009

Abstract

There has been growing interest in using joint inference across multiple subtasks as a mechanism for avoiding the cascading accumulation of errors in traditional pipelines. Several recent papers demonstrate joint inference between the segmentation of entity mentions and their de-duplication; however, they have various weaknesses: inference information flows in only one direction, the number of uncertain hypotheses is severely limited, or the subtasks are only loosely coupled. This paper presents a highly coupled, bi-directional approach to joint inference based on efficient Markov chain Monte Carlo sampling in a relational conditional random field. The model is specified with our new probabilistic programming language that leverages imperative constructs to define factor graph structure and operation. Experimental results show that our approach provides a dramatic reduction in error while also running faster than the previous state-of-the-art system.
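
To make the idea in the abstract concrete, the following is a minimal sketch of Metropolis-Hastings sampling over a toy model in which segmentation and de-duplication variables share a factor. It is not the paper's model or its programming language: the data in CITATIONS, the factor weights, and the helper names (title, score, sample) are all hypothetical and chosen only for illustration.

```python
import math
import random

# Hypothetical toy data: each citation is a token list. The hidden "split"
# index marks where the author field ends and the title field begins.
CITATIONS = [
    ["singh", "joint", "inference"],
    ["s", "singh", "joint", "inference"],
    ["kschischang", "factor", "graphs"],
]


def title(tokens, split):
    """Title field implied by a segmentation hypothesis."""
    return tuple(tokens[split:])


def score(splits, clusters):
    """Unnormalized log-score, computed imperatively from two factor types.

    All weights are made up for this sketch.
    """
    total = 0.0
    n = len(CITATIONS)
    for i in range(n):
        # Local segmentation factor: mild preference for two-token titles.
        total -= 0.5 * abs(len(title(CITATIONS[i], splits[i])) - 2)
        for j in range(i + 1, n):
            # Coupling factor: citations placed in the same entity cluster
            # should have identical title segments.
            if clusters[i] == clusters[j]:
                same = title(CITATIONS[i], splits[i]) == title(CITATIONS[j], splits[j])
                total += 2.0 if same else -2.0
    return total


def sample(steps, seed):
    """Metropolis-Hastings with a symmetric single-variable proposal."""
    rng = random.Random(seed)
    n = len(CITATIONS)
    splits = [1] * n            # start: one-token author field everywhere
    clusters = list(range(n))   # start: every citation is its own entity
    current = score(splits, clusters)
    best = (current, list(splits), list(clusters))
    for _ in range(steps):
        new_splits, new_clusters = list(splits), list(clusters)
        i = rng.randrange(n)
        if rng.random() < 0.5:
            # Propose a new segmentation (both fields stay non-empty).
            new_splits[i] = rng.randrange(1, len(CITATIONS[i]))
        else:
            # Propose a new entity assignment.
            new_clusters[i] = rng.randrange(n)
        # For brevity the full score is recomputed; only factors touching
        # the changed variable actually differ between the two states.
        proposed = score(new_splits, new_clusters)
        if rng.random() < math.exp(min(0.0, proposed - current)):
            splits, clusters, current = new_splits, new_clusters, proposed
            if current > best[0]:
                best = (current, list(splits), list(clusters))
    return best


best_score, best_splits, best_clusters = sample(steps=5000, seed=0)
```

Because the coupling factor reads both variable types, a changed split alters which cluster moves are accepted, and a changed cluster alters which splits are accepted. In this toy setting that is the sense in which information flows in both directions.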