An Automated Integration Approach for Semi-Structured and Structured Data

  • Authors:
  • Seung-Jin Lim;Yiu-Kai Ng

  • Venue:
  • CODAS '01 Proceedings of the Third International Symposium on Cooperative Database Systems for Advanced Applications
  • Year:
  • 2001

Abstract

As access to data beyond traditional intranet boundaries becomes common on the Internet, demand is growing for an integrated, uniform method of accessing Web data sources that differ in structure and semantics. This demand is driven partly by users who want access to more diverse information, such as up-to-date information on the stock market, entertainment, news, and science, and partly by information providers who offer information services to customers on the Web. In this paper, we present an approach that integrates semi-structured and structured data sources by means of automated structure resolution. The structure resolution approach can readily be applied to (i) integrate existing relations in the relational database model into semi-structured data sources, and (ii) merge sets of semi-structured data that have different structures without human intervention. Integrating multiple data sources with our approach yields a unified view (UV) of the data sources, expressed as an XML DTD. The UV can be used for query optimization over heterogeneous data sources.
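To make the idea of a unified view concrete, the following is a minimal illustrative sketch and not the structure resolution algorithm of the paper, whose details the abstract does not give. It assumes a deliberately simple rule: a relation becomes a parent element with one child per column, and when two sources declare the same element, a child that appears in only one of them becomes optional in the merged DTD. The function names (`relation_to_decls`, `merge_decls`, `to_dtd`) and the `book` example data are hypothetical.

```python
def relation_to_decls(relation, columns):
    """Map a relation to element declarations (assumed rule, for illustration).

    Each declaration maps an element name to {child_name: is_required}.
    An empty mapping marks a leaf element holding character data.
    """
    decls = {relation: {col: True for col in columns}}
    for col in columns:
        decls[col] = {}
    return decls


def merge_decls(left, right):
    """Merge two sets of declarations into one unified set.

    A child stays required only if both sources declare the parent
    and both list the child as required; otherwise it becomes optional.
    """
    merged = {}
    names = list(left) + [n for n in right if n not in left]
    for name in names:
        l, r = left.get(name), right.get(name)
        if l is None or r is None:
            # Declared in only one source: copy it unchanged.
            merged[name] = dict(l if l is not None else r)
            continue
        children = {}
        for child in list(l) + [c for c in r if c not in l]:
            children[child] = l.get(child, False) and r.get(child, False)
        merged[name] = children
    return merged


def to_dtd(decls):
    """Render the declarations as XML DTD element type declarations."""
    lines = []
    for name, children in decls.items():
        if children:
            body = ", ".join(c if req else c + "?" for c, req in children.items())
        else:
            body = "#PCDATA"
        lines.append(f"<!ELEMENT {name} ({body})>")
    return "\n".join(lines)


# Hypothetical inputs: one relation and one semi-structured source.
relational = relation_to_decls("book", ["title", "price"])
semi_structured = {
    "book": {"title": True, "author": True},
    "title": {},
    "author": {},
}
unified_view = merge_decls(relational, semi_structured)
print(to_dtd(unified_view))
```

In this sketch, `title` stays required because both sources provide it, while `price` and `author` become optional. A real resolution procedure would also have to handle naming conflicts, nesting differences, and repetition, which this example ignores.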