Multilingual extraction of semantic indexes

  • Authors:
  • Catherine Roussey;Sylvie Calabretto;Farah Harrathi

  • Affiliations:
  • Université de Lyon, Villeurbanne Cedex;INSA Lyon LIRIS CNRS UMR, Villeurbanne Cedex;INSA Lyon LIRIS CNRS UMR, Villeurbanne Cedex

  • Venue:
  • SADPI '07 Proceedings of the 2007 international workshop on Semantically aware document processing and indexing
  • Year:
  • 2007

Abstract

This article deals with multilingual document indexing. We propose an indexing method that proceeds in several stages. First, the most important terms of the document are extracted using general characteristics of languages and statistical methods. The term extraction stages can therefore be applied to any document, whatever its language. Second, the method uses a multilingual ontology to find the most relevant concepts representing the document content. It can thus be applied to a multilingual corpus containing documents written in different languages. This indexing procedure is part of a Multilingual Document System called SyDoM, which manages XML documents.
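
The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the statistical measures or the ontology format, so raw term frequency with a minimum token length stands in for the language-independent extraction stage, and a small hand-written dictionary stands in for the multilingual ontology. The names `ONTOLOGY`, `extract_terms`, `map_to_concepts`, and `index_document` are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical toy ontology (an assumption, not from the paper):
# each concept id maps to its labels in several languages.
ONTOLOGY = {
    "C_DOG": {"en": ["dog"], "fr": ["chien"]},
    "C_CAT": {"en": ["cat"], "fr": ["chat"]},
}


def extract_terms(text, top_k=5, min_len=3):
    """Stage 1 (assumed): rank tokens by raw frequency.

    No language-specific resources are used. The length filter is a
    crude stand-in for "general characteristics of languages".
    """
    tokens = [t for t in re.findall(r"\w+", text.lower()) if len(t) >= min_len]
    return [term for term, _ in Counter(tokens).most_common(top_k)]


def map_to_concepts(terms, ontology=ONTOLOGY):
    """Stage 2 (assumed): look up each term among the ontology labels."""
    label_to_concept = {}
    for concept, labels_by_lang in ontology.items():
        for labels in labels_by_lang.values():
            for label in labels:
                label_to_concept[label] = concept
    # Return concept ids, so documents in different languages share one index.
    return sorted({label_to_concept[t] for t in terms if t in label_to_concept})


def index_document(text):
    """Run both stages and return the concept-level index of a document."""
    return map_to_concepts(extract_terms(text))


if __name__ == "__main__":
    print(index_document("Le chien dort. Le chien mange. Un chat regarde le chien."))
    print(index_document("The dog barks. The dog sleeps."))
```

Because the index holds concept identifiers rather than surface terms, the French and English documents in this sketch both end up indexed under `C_DOG`, which is the property that makes such an index usable across a multilingual corpus.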