Towards Near Real-Time Data Warehousing

  • Authors:
  • Li Chen; Wenny Rahayu; David Taniar

  • Venue:
  • AINA '10 Proceedings of the 2010 24th IEEE International Conference on Advanced Information Networking and Applications
  • Year:
  • 2010

Abstract

A data warehouse is built as a layer on top of existing operational database systems. Once built, it has to be regularly updated (refreshed). Currently, most data warehouse approaches employ static refresh mechanisms whereby updates occur at a fixed interval, e.g. once every day, week, or quarter. While this might be adequate for some systems, others require a more rigorous approach that ensures analysis is always up to date. A static refresh interval is inadequate for systems with a high update frequency. A real-time data warehouse incorporates operational data changes in real time. However, it is often unnecessary or even inefficient to immediately send every update from the operational database to the data warehouse. In this paper, we propose a near real-time refresh mechanism that takes into consideration a number of measures: (i) impact from record, (ii) number of records affected, and (iii) frequency request measure. The combination of these measures can accurately identify when the data warehouse needs to be strictly real-time, or near real-time (i.e. right-time). Our experiments show that the proposed approach offers a significant benefit in terms of refresh operation cost in comparison to real-time warehousing, while still maintaining a high freshness level of the data warehouse.
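
The abstract names the three measures but does not say how they are combined. The following is a minimal sketch of one possible decision rule, not the authors' formula: the `PendingChange` type, the `refresh_mode` function, the normalisation of each measure to [0, 1], the equal-weight average, and the 0.5 threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PendingChange:
    """Hypothetical summary of changes waiting in the operational database."""
    record_impact: float      # (i) impact from record, assumed normalised to [0, 1]
    records_affected: int     # (ii) number of records affected
    request_frequency: float  # (iii) frequency request measure, assumed in [0, 1]


def refresh_mode(change: PendingChange,
                 total_records: int,
                 threshold: float = 0.5) -> str:
    """Return 'real-time' or 'deferred' for a batch of pending changes.

    The equal-weight average and the default threshold are illustrative
    assumptions; the abstract does not specify either.
    """
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    # Express the affected-record count as a share of the table, capped at 1.
    affected_ratio = min(change.records_affected / total_records, 1.0)
    score = (change.record_impact
             + affected_ratio
             + change.request_frequency) / 3
    return "real-time" if score >= threshold else "deferred"
```

Under these assumptions, a change that touches many frequently queried, high-impact records would be pushed to the warehouse immediately, while a small change to rarely queried data would wait for a later (right-time) refresh.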