Query optimization in XML-based information integration

  • Authors:
  • Dongfeng Chen;Rada Chirkova;Maxim Kormilitsin;Fereidoon Sadri;Timo J. Salo

  • Affiliations:
  • NC State University, Raleigh, NC, USA;NC State University, Raleigh, NC, USA;NC State University, Raleigh, NC, USA;UNC-Greensboro, Greensboro, NC, USA;IBM RTP, Research Triangle Park, NC, USA

  • Venue:
  • Proceedings of the 17th ACM conference on Information and knowledge management
  • Year:
  • 2008

Abstract

The problem of decentralized data sharing is relevant for a wide range of applications and is still a source of major theoretical and practical challenges, in spite of many years of sustained research in information integration. We focus on the challenge of efficiency of query evaluation in information-integration systems, with the objective of developing query-processing strategies that are widely applicable and easy to implement in real-life applications. In our algorithms we take into account important features of today's data-sharing applications, namely: XML as a likely interface to or representation for data sources; the potential for information overlap across data sources; and the need for inter-source processing (i.e., joins of data across data sources) in many applications. To the best of our knowledge, our methods are the first to account for the practical issues of information overlap across data sources and of inter-source processing. While most of our algorithms are platform- and implementation-independent, we also propose XML-specific optimization techniques that allow for system-level tuning of query-processing performance. Finally, using real-life datasets and our implementation of an information-integration system shell, we provide experimental results that demonstrate that our algorithms are efficient and competitive in the information-integration setting. For all the details, please see [1].