Extraction of temporal facts and events from Wikipedia

  • Authors:
  • Erdal Kuzey;Gerhard Weikum

  • Affiliations:
  • Max Planck Institute for Informatics, Saarbrücken, Germany;Max Planck Institute for Informatics, Saarbrücken, Germany

  • Venue:
  • Proceedings of the 2nd Temporal Web Analytics Workshop
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Recently, large-scale knowledge bases have been constructed by automatically extracting relational facts from text. Unfortunately, most of the current knowledge bases focus on static facts and ignore the temporal dimension. However, the vast majority of facts are evolving with time or are valid only during a particular time period. Thus, time is a significant dimension that should be included in knowledge bases. In this paper, we introduce a complete information extraction framework that harvests temporal facts and events from semi-structured data and free text of Wikipedia articles to create a temporal ontology. First, we extend a temporal data representation model by making it aware of events. Second, we develop an information extraction method which harvests temporal facts and events from Wikipedia infoboxes, categories, lists, and article titles in order to build a temporal knowledge base. Third, we show how the system can use its extracted knowledge for further growing the knowledge base. We demonstrate the effectiveness of our proposed methods through several experiments. We extracted more than one million temporal facts with precision over 90% for extraction from semi-structured data and almost 70% for extraction from text.