From information to knowledge: harvesting entities and relationships from web sources

  • Authors:
  • Gerhard Weikum; Martin Theobald

  • Affiliations:
  • Max Planck Institute for Informatics, Saarbruecken, Germany; Max Planck Institute for Informatics, Saarbruecken, Germany

  • Venue:
  • Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
  • Year:
  • 2010

Abstract

There are major trends to advance the functionality of search engines to a more expressive semantic level. This is enabled by the advent of knowledge-sharing communities such as Wikipedia and by progress in automatically extracting entities and relationships from semistructured as well as natural-language Web sources. Recent endeavors of this kind include DBpedia, EntityCube, KnowItAll, ReadTheWeb, and our own YAGO-NAGA project, among others. The goal is to automatically construct and maintain a comprehensive knowledge base of facts about named entities, their semantic classes, their mutual relations, and their temporal contexts, with high precision and high recall. This tutorial discusses state-of-the-art methods, research opportunities, and open challenges along this avenue of knowledge harvesting.