Fuzzy Annotation of Web Data Tables Driven by a Domain Ontology

  • Authors:
  • Gaëlle Hignette;Patrice Buche;Juliette Dibie-Barthélemy;Ollivier Haemmerlé

  • Affiliations:
  • INRA/AgroParisTech Unité Mét@risk, Université de Toulouse le Mirail, Paris Cedex 5, France F-75231;INRA/AgroParisTech Unité Mét@risk, Université de Toulouse le Mirail, Paris Cedex 5, France F-75231;INRA/AgroParisTech Unité Mét@risk, Université de Toulouse le Mirail, Paris Cedex 5, France F-75231;INRA/AgroParisTech Unité Mét@risk, Université de Toulouse le Mirail, Paris Cedex 5, France F-75231

  • Venue:
  • ESWC 2009 Heraklion Proceedings of the 6th European Semantic Web Conference on The Semantic Web: Research and Applications
  • Year:
  • 2009

Abstract

We propose an automatic system for accurately annotating data tables extracted from the web. The system is designed to supply additional data to an existing querying system, MIEL, which relies on a common vocabulary to query local relational databases. We use the same vocabulary, translated into an OWL ontology, to annotate the tables. The annotation system is unsupervised: it uses only the knowledge defined in the ontology to annotate the entire content of a table automatically, following an aggregation approach that first annotates cells, then columns, then the relations between those columns. The annotations are fuzzy: instead of linking a table element to a single precise concept of the ontology, the system annotates each element with several concepts, each associated with a relevance degree. The annotation process has been validated experimentally on two scientific domains (microbial risk in food and chemical risk in food) and one technical domain (aeronautics).
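
The cell, column, relation aggregation described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy `ONTOLOGY` and `RELATIONS` dictionaries stand in for the OWL ontology, word-level Jaccard similarity stands in for whatever term comparison the authors use, and the mean and min/max aggregations are only one plausible way to combine relevance degrees.

```python
from collections import defaultdict

# Hypothetical toy ontology (assumption): concept -> known terms.
ONTOLOGY = {
    "Food": {"minced beef", "smoked salmon", "whole milk"},
    "Microorganism": {"listeria monocytogenes", "salmonella"},
}

# Hypothetical relation signatures (assumption): relation -> required concepts.
RELATIONS = {"Contamination": {"Food", "Microorganism"}}


def similarity(a, b):
    """Jaccard similarity between the lowercase word sets of two strings."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def annotate_cell(cell):
    """Fuzzy cell annotation: every concept with a non-zero relevance degree."""
    scores = {}
    for concept, terms in ONTOLOGY.items():
        best = max(similarity(cell, term) for term in terms)
        if best > 0:
            scores[concept] = best
    return scores


def annotate_column(cells):
    """Aggregate cell annotations: mean relevance degree per concept."""
    if not cells:
        return {}
    totals = defaultdict(float)
    for cell in cells:
        for concept, degree in annotate_cell(cell).items():
            totals[concept] += degree
    return {concept: total / len(cells) for concept, total in totals.items()}


def annotate_relations(columns):
    """Aggregate column annotations: a relation is as relevant as the
    weakest concept of its signature, using the best column for each."""
    column_annotations = [annotate_column(cells) for cells in columns]
    result = {}
    for relation, signature in RELATIONS.items():
        degree = min(
            max((a.get(concept, 0.0) for a in column_annotations), default=0.0)
            for concept in signature
        )
        if degree > 0:
            result[relation] = degree
    return result


if __name__ == "__main__":
    table_columns = [
        ["beef", "smoked salmon"],
        ["Listeria monocytogenes", "Salmonella"],
    ]
    for cells in table_columns:
        print(annotate_column(cells))
    print(annotate_relations(table_columns))
```

The point of the sketch is that no element receives a single hard label: a cell such as "beef" only partially matches the ontology term "minced beef", and that partial degree propagates upward into the column and relation annotations.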