A survey of approaches to automatic schema matching
The VLDB Journal — The International Journal on Very Large Data Bases
Extracting structured data from Web pages
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Towards the self-annotating web
Proceedings of the 13th international conference on World Wide Web
Profile-Based Object Matching for Information Integration
IEEE Intelligent Systems
The MIEL++ architecture when RDB, CGs and XML meet for the sake of risk assessment in food products
ICCS'06 Proceedings of the 14th international conference on Conceptual Structures: Inspiration and Application
Approximate querying of XML fuzzy data
FQAS'06 Proceedings of the 7th international conference on Flexible Query Answering Systems
Our work deals with the automatic construction of domain-specific data warehouses. Our application domain is microbiological risk in food products. The MIEL++ system [2], implemented during the Sym'Previus project, is a tool based on a database containing experimental and industrial results about the behavior of pathogenic germs in food products. This database is incomplete by nature, since the number of possible experiments is potentially infinite. Our work, developed within the e.dot project, presents a way of compensating for that incompleteness by complementing the database with data automatically extracted from the Web. We propose to query these data through a mediated architecture based on a domain ontology, so the extracted data must first be made compatible with that ontology. In the e.dot project [5], we focus exclusively on documents in HTML or PDF format that contain data tables, which are a very common way of presenting synthetic data in scientific articles. These tables are semantically enriched, and we want this enrichment to be as automatic and flexible as possible. To this end, we have defined a Document Type Definition named SML (Semantic Markup Language), which can deal with additional or incomplete information in a semantic relation, ambiguities, and possible interpretation errors. In this paper, we present this semantic enrichment step.