Resource optimization for processing of stream data in data warehouse environment

  • Authors:
  • M. Asif Naeem; Gillian Dobbie; Imran Sarwar Bajwa; Gerald Weber

  • Affiliations:
  • Auckland University of Technology, Auckland, New Zealand; The University of Auckland, Auckland, New Zealand; University of Birmingham, UK; The University of Auckland, Auckland, New Zealand

  • Venue:
  • Proceedings of the International Conference on Advances in Computing, Communications and Informatics
  • Year:
  • 2012

Abstract

To meet the growing business demand for up-to-date information, current data integration approaches are moving towards real-time updates. In real-time data integration, updates occurring on the source systems need to be reflected in the data warehouse immediately. One important element of real-time data integration is the join of a continuous incoming data stream with disk-based master data. In this context, a stream-based algorithm called X-HYBRIDJOIN (Extended Hybrid Join) was proposed earlier, with favorable asymptotic runtime behavior. However, its absolute performance was not as good as hoped. In this paper we present results showing that, with proper tuning, the resulting Tuned X-HYBRIDJOIN performs significantly better than the previous X-HYBRIDJOIN and better than other applicable join operators found in the literature. We present the tuning approach, which is based on measurement techniques and a revised cost model. To evaluate the algorithm's performance, we conduct an experimental study showing that Tuned X-HYBRIDJOIN exhibits the desired performance characteristics.