A semi-automatic approach for building ontologies from a collection of structured web documents

  • Authors:
  • Mouna Kamel; N. Aussenac-Gilles; Davide Buscaldi; Catherine Comparot

  • Affiliations:
  • Université de Toulouse, Toulouse, France; Université de Toulouse, Toulouse, France; LIPN-Univ. Paris Nord, Villetaneuse, France; Université de Toulouse, Toulouse, France

  • Venue:
  • Proceedings of the seventh international conference on Knowledge capture
  • Year:
  • 2013

Abstract

Many collections of structured documents are available on the web. Such a collection generally describes the characteristics of entities of a single type, with each page describing one entity. These documents are adequate knowledge sources for building ontologies. Because they share a strong common layout, they contain less well-written text than plain text files, but their structure is very meaningful. Classical linguistics-based methods for identifying concepts and relations are therefore no longer appropriate for analyzing them. The approach we propose in this paper exploits various properties of such documents, combining layout/formatting analysis with linguistic analysis and using semantic annotation.