Interactive source registration in community-oriented information integration

  • Authors:
  • Yannis Katsis; Alin Deutsch; Yannis Papakonstantinou

  • Affiliations:
  • UC San Diego; UC San Diego; UC San Diego

  • Venue:
  • Proceedings of the VLDB Endowment
  • Year:
  • 2008

Abstract

Modern Internet communities need to integrate and query structured information. With current information integration infrastructure, data integration remains very costly because source registration is performed by a central authority, which becomes a bottleneck. We propose the community-based integration paradigm, which pushes the source registration task to the independent community members. This creates new challenges caused by each community member's lack of a global overview of how her data interacts with the application queries of the community and with the data from other sources. How can the source owner maximize the visibility of her data to existing applications while minimizing the clean-up and reformatting cost associated with publishing? Does her data contradict (or could it contradict in the future) the data of other sources? We introduce RIDE, a visual registration tool that extends schema mapping interfaces like those of Microsoft BizTalk Server and IBM's Clio with a suggestion component that guides the source owner through autonomous registration and assists her in answering these questions. RIDE's implementation features efficient procedures for deciding various levels of self-reliance of a GLAV-style source registration for contributing answers to an application query, and for checking potential and definite inconsistency across sources.