A proximity measure and a clustering method for concept extraction in an ontology building perspective

  • Authors:
  • Guillaume Cleuziou; Sylvie Billot; Stanislas Lew; Lionel Martin; Christel Vrain

  • Affiliations:
  • Laboratoire d'Informatique Fondamentale (LIFO), Université d'Orléans, Orléans, France (all authors)

  • Venue:
  • ISMIS'06 Proceedings of the 16th international conference on Foundations of Intelligent Systems
  • Year:
  • 2006

Abstract

In this paper, we study the problem of clustering textual units in order to help an expert build a specialized ontology. This work was carried out within a French project called Biotim, which deals with botany corpora. Building an ontology, whether automatically or semi-automatically, is a difficult task. We focus on one of the main steps of that process: structuring the textual units that occur in the texts into classes likely to represent concepts of the domain. The approach we propose relies on a new non-symmetrical measure of the semantic proximity between lemmas, which takes into account the contexts in which they occur in the documents. We also present an unsupervised classification algorithm designed for this task and this kind of data. Initial experiments on botanical data have given relevant results.