Automated construction of domain ontology taxonomies from Wikipedia

  • Authors: Damir Jurić; Marko Banek; Zoran Skočir
  • Affiliation (all authors): University of Zagreb, Faculty of Electrical Engineering and Computing, Zagreb, Croatia
  • Venue: DEXA'11 Proceedings of the 22nd international conference on Database and expert systems applications - Volume Part II
  • Year: 2011

Abstract

The key step for implementing the idea of the Semantic Web into a feasible system is providing a variety of domain ontologies that are constructed on demand, in an automated manner and in a very short time. In this paper we introduce an unsupervised method for constructing domain ontology taxonomies from Wikipedia. The benefit of using Wikipedia as the source is twofold: first, the Wikipedia articles are concise and have a particularly high "density" of domain knowledge; second, the articles represent a consensus of a large community, thus avoiding term disagreements and misinterpretations. The taxonomy construction algorithm, aimed at finding the subsumption relation, is based on two different techniques, which both apply linguistic parsing: analyzing the first sentence of each Wikipedia article and processing the categories associated with the article. The method has been evaluated against human judgment for two independent domains and the experimental results have proven its robustness and high precision.
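The two techniques named in the abstract can be illustrated with a rough sketch. The abstract says the method relies on linguistic parsing; the code below does not reproduce that and instead swaps in a simple regular-expression heuristic, so it should be read as an illustration of the idea only. The function names, the copula pattern, and the stop-word list are all assumptions made for this example and do not come from the paper.

```python
import re

# Assumed heuristic (not the paper's parser): find "is/are/was/were a/an/the <noun phrase>".
_COPULA = re.compile(
    r"\b(?:is|are|was|were)\s+(?:an?|the)\s+(?P<np>[^.,;(]+)",
    re.IGNORECASE,
)

# Assumed list of words that typically end the head noun phrase.
_STOP = {"of", "in", "for", "that", "which", "who", "used",
         "from", "with", "on", "by", "to", "at"}


def _head_noun(phrase):
    """Return the last word before the first stop word, lowercased."""
    words = []
    for word in phrase.split():
        if word.lower() in _STOP:
            break
        words.append(word)
    return words[-1].lower() if words else None


def hypernym_from_first_sentence(sentence):
    """Guess a candidate parent concept from an article's first sentence."""
    match = _COPULA.search(sentence)
    if match is None:
        return None
    return _head_noun(match.group("np"))


def hypernym_from_category(category):
    """Guess a candidate parent concept from a category label."""
    return _head_noun(category)


if __name__ == "__main__":
    print(hypernym_from_first_sentence(
        "A dog is a domesticated mammal of the family Canidae."))
    print(hypernym_from_category("Mammals of Europe"))
```

In this sketch, both functions reduce a phrase to its head noun, which then serves as a candidate parent in the subsumption relation. A real implementation would likely need part-of-speech tagging, plural normalization, and handling of sentences that do not follow the copula pattern.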