ETL processes are sometimes interrupted by a failure. In such cases, an algorithm for resuming the interrupted extraction is usually applied. In this paper we present a modified Design-Resume (DR) algorithm, extended to handle ETL processes that contain many loading nodes. We use the DR algorithm to resume a parallel data warehouse load process. The key feature of this algorithm is that it imposes no additional overhead on the normal ETL process. Our modification lets the algorithm work with more than one loading node, which increases the efficiency of the resumption process. We discuss the benefits of our improvements based on the results of the tests we performed.
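To make the idea of resuming a load across several loading nodes concrete, the following is a minimal sketch in Python. It is not the DR algorithm or the modification described above. It only illustrates the general principle of skipping rows that the warehouse already holds, without keeping any log during the normal load. The names `resume_load`, `route`, and `loaded_keys`, the `(key, payload)` row shape, and the two-node routing rule are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

# Hypothetical row shape: (unique key, payload).
Row = Tuple[int, str]


def resume_load(
    source: Iterable[Row],
    route: Callable[[Row], str],
    loaded_keys: Dict[str, Set[int]],
) -> Dict[str, List[Row]]:
    """Return, for each loading node, the rows that still have to be loaded.

    loaded_keys maps a node name to the keys that node had already stored
    when the failure occurred (assumed to be queryable after the failure).
    """
    pending: Dict[str, List[Row]] = {node: [] for node in loaded_keys}
    for row in source:
        node = route(row)  # decide which loading node owns this row
        if row[0] not in loaded_keys[node]:
            pending[node].append(row)  # not stored yet, so reload it
    return pending


# Illustrative data: five source rows split over two loading nodes.
source = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")]


def route(row: Row) -> str:
    # Assumed routing rule: even keys go to one node, odd keys to the other.
    return "node_even" if row[0] % 2 == 0 else "node_odd"


# State of each node at the time of the failure (assumed).
loaded = {"node_odd": {1, 3}, "node_even": {2}}

pending = resume_load(source, route, loaded)
print(pending)
```

In this sketch each node is filtered against its own stored keys, so a node that progressed further before the failure has less to reload. A real resumption algorithm would also have to account for transformations between extraction and loading, which this sketch leaves out.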