Automatic acquisition of taxonomies in different languages from multiple Wikipedia versions

  • Authors:
  • Renato Domínguez García;Christoph Rensing;Ralf Steinmetz

  • Affiliations:
  • Multimedia Communications Lab TU Darmstadt, Darmstadt, Germany;Multimedia Communications Lab TU Darmstadt, Darmstadt, Germany;Multimedia Communications Lab TU Darmstadt, Darmstadt, Germany

  • Venue:
  • i-KNOW '11 Proceedings of the 11th International Conference on Knowledge Management and Knowledge Technologies
  • Year:
  • 2011

Abstract

In recent years, the vision of the Semantic Web has led to many approaches that aim to automatically derive knowledge bases from Wikipedia. These approaches rely mostly on the English Wikipedia, as it is the largest Wikipedia version, and have led to valuable knowledge bases. However, each Wikipedia version contains socio-cultural knowledge, i.e., knowledge with specific relevance for a culture or language. One difficulty in applying existing approaches to multiple Wikipedia versions is their use of additional corpora. In this paper, we describe how we adapt existing heuristics so that large sets of hyponymy relations can be extracted from multiple Wikipedia versions with little information about each language. Further, we evaluate our approach on Wikipedia versions in four different languages and compare the results with GermaNet for German and WordNet for English.