Robust disambiguation of named entities in text

  • Authors:
  • Johannes Hoffart;Mohamed Amir Yosef;Ilaria Bordino;Hagen Fürstenau;Manfred Pinkal;Marc Spaniol;Bilyana Taneva;Stefan Thater;Gerhard Weikum

  • Affiliations:
  • Max Planck Institute for Informatics, Saarbrücken, Germany;Max Planck Institute for Informatics, Saarbrücken, Germany;Yahoo! Research Lab, Barcelona, Spain;Saarland University, Saarbrücken, Germany;Saarland University, Saarbrücken, Germany;Max Planck Institute for Informatics, Saarbrücken, Germany;Max Planck Institute for Informatics, Saarbrücken, Germany;Saarland University, Saarbrücken, Germany;Max Planck Institute for Informatics, Saarbrücken, Germany

  • Venue:
  • EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
  • Year:
  • 2011

Abstract

Disambiguating named entities in natural-language text maps mentions of ambiguous names onto canonical entities, such as people or places, registered in a knowledge base such as DBpedia or YAGO. This paper presents a robust method for collective disambiguation that harnesses context from knowledge bases and uses a new form of coherence graph. It unifies prior approaches into a comprehensive framework that combines three measures: the prior probability of an entity being mentioned, the similarity between the contexts of a mention and a candidate entity, and the coherence among candidate entities for all mentions together. The method builds a weighted graph of mentions and candidate entities, and computes a dense subgraph that approximates the best joint mention-entity mapping. Experiments show that the new method significantly outperforms prior methods in terms of accuracy, with robust behavior across a variety of inputs.
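
The graph construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the three measures are weighted or how the dense subgraph is found, so the linear combination in `local_score`, the greedy removal heuristic in `disambiguate`, and all numeric weights below are assumptions made purely for illustration. Mention-entity edges carry a mix of prior and context similarity, entity-entity edges carry coherence, and the sketch repeatedly drops the entity with the lowest weighted degree as long as every mention keeps at least one candidate.

```python
from collections import defaultdict


def local_score(prior, similarity, alpha=0.5, beta=0.5):
    # Hypothetical linear mix of prior and context similarity (weights assumed).
    return alpha * prior + beta * similarity


def disambiguate(me_weight, ee_weight):
    """Greedy dense-subgraph heuristic (illustrative assumption).

    me_weight: {(mention, entity): weight} for mention-entity edges.
    ee_weight: {frozenset({e1, e2}): weight} for entity-entity coherence edges.
    Returns {mention: chosen entity}.
    """
    candidates = defaultdict(set)
    for mention, entity in me_weight:
        candidates[mention].add(entity)

    def weighted_degree(entity):
        # Sum of edges to mentions and to entities still in the graph.
        alive = set().union(*candidates.values())
        degree = sum(w for (m, e), w in me_weight.items()
                     if e == entity and e in candidates[m])
        degree += sum(w for pair, w in ee_weight.items()
                      if entity in pair and pair <= alive)
        return degree

    while True:
        alive = set().union(*candidates.values())
        # An entity may be removed only if no mention would lose its last candidate.
        removable = sorted(
            e for e in alive
            if all(len(es) > 1 for es in candidates.values() if e in es)
        )
        if not removable:
            break
        victim = min(removable, key=weighted_degree)
        for es in candidates.values():
            es.discard(victim)

    # If a mention still has several candidates, fall back to the local score.
    return {m: max(sorted(es), key=lambda e: me_weight[(m, e)])
            for m, es in candidates.items()}


# Made-up weights for two ambiguous mentions; not taken from the paper.
me_weight = {
    ("Page", "Larry_Page"): local_score(prior=0.6, similarity=0.2),
    ("Page", "Jimmy_Page"): local_score(prior=0.4, similarity=0.5),
    ("Kashmir", "Kashmir_(region)"): local_score(prior=0.9, similarity=0.3),
    ("Kashmir", "Kashmir_(song)"): local_score(prior=0.1, similarity=0.5),
}
ee_weight = {
    frozenset({"Jimmy_Page", "Kashmir_(song)"}): 0.9,
    frozenset({"Larry_Page", "Kashmir_(region)"}): 0.05,
}

mapping = disambiguate(me_weight, ee_weight)
print(mapping)
```

With these invented weights, the mention "Kashmir" resolves to `Kashmir_(song)` even though `Kashmir_(region)` has the higher local score, because the coherence edge to `Jimmy_Page` keeps the song's weighted degree high. This is the kind of joint effect a collective method is meant to capture, whereas a per-mention ranking by prior and similarity alone would pick the region.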