Fuzzy semantic tagging and flexible querying of XML documents extracted from the Web

  • Authors:
  • Patrice Buche;Juliette Dibie-Barthélemy;Ollivier Haemmerlé;Gaëlle Hignette

  • Affiliations:
  • INRA, Département Mathématiques et Informatique Appliquées, Unité Mét@risk, Paris, Cedex 05 F-75231;INRA, Département Mathématiques et Informatique Appliquées, Unité Mét@risk, Paris, Cedex 05 F-75231;GRIMM-ISYCOM, Département de Mathématiques-Informatique, Université de Toulouse le Mirail, Toulouse Cedex F-31058;INRA, Département Mathématiques et Informatique Appliquées, Unité Mét@risk, Paris, Cedex 05 F-75231

  • Venue:
  • Journal of Intelligent Information Systems
  • Year:
  • 2006

Abstract

The relational database model is widely used in real applications. We propose a way of complementing such a database with an XML data warehouse. The approach is generic and driven by a domain ontology. The XML data warehouse is built from data extracted from the Web, which are semantically tagged using terms belonging to the domain ontology. The semantic tagging is fuzzy: instead of tagging each value of a Web document with a single term of the domain ontology, we tag it with a possibility distribution representing a set of possible terms, each weighted by a possibility degree. The querying of the XML data warehouse is also fuzzy: end-users can express their preferences by means of fuzzy selection criteria. We present our approach on a first application domain: predictive microbiology.
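
To make the idea of a fuzzy tag and a fuzzy selection criterion more concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes that a tag is a possibility distribution stored as a mapping from ontology terms to degrees in [0, 1], that a selection criterion is a fuzzy set of preferred terms stored the same way, and that the two are compared with the standard possibility and necessity measures of possibility theory. The function names, the ontology terms, and the degrees are all hypothetical and chosen only for illustration.

```python
from typing import Dict

# A fuzzy set over ontology terms: term -> degree in [0, 1].
# Used both for tags (possibility distributions) and for selection criteria.
Fuzzy = Dict[str, float]


def possibility(tag: Fuzzy, criterion: Fuzzy) -> float:
    """Degree to which it is possible that the tagged value satisfies the criterion.

    Standard sup-min measure: max over terms of min(pi(t), mu(t)).
    Assumes `tag` is non-empty; terms absent from `criterion` have degree 0.
    """
    return max(min(degree, criterion.get(term, 0.0)) for term, degree in tag.items())


def necessity(tag: Fuzzy, criterion: Fuzzy) -> float:
    """Degree to which it is certain that the tagged value satisfies the criterion.

    Standard inf-max measure: min over terms of max(1 - pi(t), mu(t)).
    Terms outside the tag have pi(t) = 0 and contribute 1, so they can be skipped.
    Assumes `tag` is non-empty.
    """
    return min(max(1.0 - degree, criterion.get(term, 0.0)) for term, degree in tag.items())


# Hypothetical fuzzy tag attached to a value found in a Web document:
# the value most plausibly denotes the first term, less plausibly the second.
tag = {"Listeria monocytogenes": 1.0, "Listeria innocua": 0.5}

# Hypothetical end-user preference: the first term is fully satisfactory,
# the second one only marginally.
criterion = {"Listeria monocytogenes": 1.0, "Listeria innocua": 0.25}

print(possibility(tag, criterion))  # 1.0
print(necessity(tag, criterion))    # 0.5
```

Under these assumptions, a data item whose tag yields a high possibility degree but a low necessity degree could satisfy the user's preference but is not certain to; a query engine could use the pair of degrees to rank answers.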