A New Extraction Concept Based on Contextual Clustering

  • Authors:
  • Lobna Karoui;Marie-Aude Aufaure;Nacera Bennacer

  • Affiliations:
  • Ecole Superieure d'Electricite, France;Ecole Superieure d'Electricite, France;Ecole Superieure d'Electricite, France

  • Venue:
  • CIMCA '06 Proceedings of the International Conference on Computational Intelligence for Modelling Control and Automation and International Conference on Intelligent Agents Web Technologies and International Commerce
  • Year:
  • 2006

Abstract

Ontologies provide a common layer that plays a major role in supporting information exchange and sharing. The proliferation of ontologies relies strongly on automating their building, integration and deployment processes. In this paper, we present an integrated framework involving complementary dimensions to drive the (semi-)automatic acquisition of conceptual knowledge from HTML Web pages. Our approach takes advantage of the structural features of HTML documents and of word location to identify the appropriate term context. Our context definition improves word weighting, the selection of semantically closer co-occurrents, and the relevance of the extracted ontological concepts. We use an unsupervised clustering method to generate term groups. Note that the chosen clustering method relies on an incremental, user-driven quality evaluation process. After a theoretical presentation of our structural context definition, we summarize the most significant results obtained by applying our method to a corpus dedicated to the tourism domain. These first results show how the definition of an appropriate context improves the relevance of the extracted concepts.
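
To make the idea of a structure-based term context more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a word's weight depends on the HTML tag enclosing it and that two words co-occur when they appear in the same text block. The tag weights in `TAG_WEIGHTS`, the class `StructuralContextParser` and the function `analyze` are hypothetical names and values chosen for illustration; the abstract does not specify them.

```python
from collections import Counter, defaultdict
from html.parser import HTMLParser
from itertools import combinations
import re

# Hypothetical tag weights: the abstract gives no actual values.
TAG_WEIGHTS = {"title": 3.0, "h1": 2.5, "h2": 2.0, "b": 1.5, "p": 1.0}


class StructuralContextParser(HTMLParser):
    """Weights words by their enclosing tag and counts block-level co-occurrences."""

    def __init__(self):
        super().__init__()
        self.stack = []                            # open tags that carry a weight
        self.weights = Counter()                   # word -> accumulated weight
        self.cooccurrence = defaultdict(Counter)   # word -> co-occurring word counts

    def handle_starttag(self, tag, attrs):
        if tag in TAG_WEIGHTS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop until the matching open tag has been removed.
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        if not self.stack:
            return  # text outside any weighted tag is ignored
        words = re.findall(r"[a-z]+", data.lower())
        # Assumption: a word inherits the highest weight among its enclosing tags.
        weight = max(TAG_WEIGHTS[t] for t in self.stack)
        for word in words:
            self.weights[word] += weight
        # Assumption: words in the same text block share a context.
        for a, b in combinations(sorted(set(words)), 2):
            self.cooccurrence[a][b] += 1
            self.cooccurrence[b][a] += 1


def analyze(html):
    """Return (word weights, co-occurrence counts) for an HTML string."""
    parser = StructuralContextParser()
    parser.feed(html)
    parser.close()
    return parser.weights, parser.cooccurrence
```

Calling `analyze` on a small tourism page would yield weighted terms and co-occurrence counts that could then be passed to an unsupervised clustering step of the reader's choice; that step is left out here because the abstract does not name the algorithm used.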