Contextual ontological concepts extraction

  • Authors: Lobna Karoui; Nacéra Bennacer; Marie-Aude Aufaure

  • Affiliations: Ecole Supérieure d'Electricité, Gif-sur-Yvette, France (all authors)

  • Venue: DS'06 Proceedings of the 9th international conference on Discovery Science
  • Year: 2006

Abstract

Ontologies provide a common layer that plays a major role in supporting information exchange and sharing. In this paper, we focus on extracting ontological concepts from HTML documents. We propose an unsupervised hierarchical clustering algorithm, “Contextual Ontological Concept Extraction” (COCE), which applies a partitioning algorithm incrementally and is guided by a structural context. This context exploits the HTML structure and the location of words to select the semantically closest co-occurrents of each word and to improve word weighting. Guided by this context definition, we perform an incremental clustering that refines the word context of each cluster to obtain semantically coherent extracted concepts. The COCE algorithm can be run either automatically or interactively. We evaluate the COCE algorithm on French documents related to tourism. Our results show that our context-based algorithm improves the conceptual quality of the resulting clusters.
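
The two ideas in the abstract, a structural context derived from HTML and the incremental use of a partitioning step, can be illustrated with a small sketch. This is not the COCE algorithm itself, whose details the abstract does not give. The tag weights in `TAG_WEIGHTS`, the cosine similarity, the seed selection, and the `max_size` stopping rule are all assumptions made for illustration, and the parser assumes flat (non-nested) text blocks with whitespace-separated words.

```python
import math
from collections import defaultdict
from html.parser import HTMLParser

# Assumed weights: co-occurrence in a heading counts more than in body text.
TAG_WEIGHTS = {"title": 3.0, "h1": 3.0, "h2": 2.0, "p": 1.0, "li": 1.0}


class BlockCollector(HTMLParser):
    """Collect (tag, words) for each weighted block; assumes no nesting."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._open = None

    def handle_starttag(self, tag, attrs):
        if tag in TAG_WEIGHTS:
            self._open = tag
            self.blocks.append((tag, []))

    def handle_endtag(self, tag):
        if tag == self._open:
            self._open = None

    def handle_data(self, data):
        if self._open is not None:
            self.blocks[-1][1].extend(w.lower() for w in data.split())


def context_vectors(html):
    """Weighted co-occurrence vectors: words sharing a block are co-occurrents."""
    parser = BlockCollector()
    parser.feed(html)
    vectors = defaultdict(lambda: defaultdict(float))
    for tag, words in parser.blocks:
        for word in words:
            for other in words:
                if other != word:
                    vectors[word][other] += TAG_WEIGHTS[tag]
    return vectors


def cosine(a, b):
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def split(words, vectors):
    """One partitioning step: pick two dissimilar seeds, assign words to the closer one."""
    seed_a = words[0]
    seed_b = min(words[1:], key=lambda w: cosine(vectors[seed_a], vectors[w]))
    left, right = [seed_a], [seed_b]
    for w in words:
        if w in (seed_a, seed_b):
            continue
        closer_to_a = cosine(vectors[w], vectors[seed_a]) >= cosine(vectors[w], vectors[seed_b])
        (left if closer_to_a else right).append(w)
    return left, right


def incremental_clusters(vectors, max_size=3):
    """Re-apply the partitioning step to any cluster larger than max_size (>= 1)."""
    pending, done = [sorted(vectors)], []
    while pending:
        cluster = pending.pop()
        if len(cluster) <= max_size:
            done.append(cluster)
        else:
            pending.extend(split(cluster, vectors))
    return done


if __name__ == "__main__":
    page = "<h1>hotel room</h1><p>hotel room breakfast</p><p>beach sea</p><p>beach sea sand</p>"
    print(incremental_clusters(context_vectors(page)))
```

Each split puts its two seeds in different clusters, so both halves are non-empty and the loop terminates. An interactive mode, as mentioned in the abstract, could be approximated by asking the user whether to split a cluster instead of applying the fixed `max_size` rule.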