R-MESHJOIN for near-real-time data warehousing
DOLAP '10 Proceedings of the ACM 13th international workshop on Data warehousing and OLAP
X-HYBRIDJOIN for near-real-time data warehousing
BNCOD'11 Proceedings of the 28th British national conference on Advances in databases
Resource optimization for processing of stream data in data warehouse environment
Proceedings of the International Conference on Advances in Computing, Communications and Informatics
A lightweight stream-based join with limited resource consumption
DaWaK'12 Proceedings of the 14th international conference on Data Warehousing and Knowledge Discovery
HYBRIDJOIN for Near-Real-Time Data Warehousing
International Journal of Data Warehousing and Mining
A generic front-stage for semi-stream processing
Proceedings of the 22nd ACM international conference on Information & knowledge management
Active XML-based Web data integration
Information Systems Frontiers
Extract-Transform-Load (ETL) tools feed data from operational databases into data warehouses. Traditionally, these tools run in batch mode, offline and at regular intervals, for example nightly or weekly. Users naturally prefer up-to-date data for making decisions, so there is a demand for real-time ETL tools. In this paper we investigate an event-based, near-real-time ETL layer for transferring and transforming data from the operational database to the data warehouse. One of our main concerns is master data management in the ETL layer. We present the architecture of a novel, general-purpose, event-driven, near-real-time ETL layer that uses a Database Queue (DBQ), works on a push principle, and directly supports content enrichment. We also observe that the system architecture is consistent with the information architecture of a classical Online Transaction Processing (OLTP) application, which allows us to distinguish between different kinds of data and so increase the clarity of the design.
Keywords: event-based architecture, content enrichment, master data, extract-transform-load, enterprise service bus.
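The event-driven flow described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the names `ChangeEvent`, `EtlLayer`, `on_change`, `enrich`, and `drain`, the `customer_id` field, and the use of an in-memory queue in place of a real Database Queue are all assumptions made for illustration. The sketch only shows the general idea of pushing change events into a queue and enriching each transactional row with master data before loading it.

```python
import queue
from dataclasses import dataclass


@dataclass
class ChangeEvent:
    """A change captured in the operational database (hypothetical shape)."""
    table: str
    key: int
    payload: dict


class EtlLayer:
    """Toy event-driven ETL layer: events in, enriched rows out."""

    def __init__(self, master_data):
        self.dbq = queue.Queue()        # in-memory stand-in for the Database Queue (DBQ)
        self.master_data = master_data  # hypothetical master-data lookup keyed by customer id
        self.warehouse = []             # in-memory stand-in for the data warehouse

    def on_change(self, event):
        """Push side: the source enqueues an event as soon as a change occurs."""
        self.dbq.put(event)

    def enrich(self, event):
        """Content enrichment: merge master data into the transactional row."""
        master = self.master_data.get(event.payload.get("customer_id"), {})
        return {"table": event.table, "key": event.key, **event.payload, **master}

    def drain(self):
        """Transform and load every queued event; return how many were loaded."""
        loaded = 0
        while not self.dbq.empty():
            self.warehouse.append(self.enrich(self.dbq.get()))
            loaded += 1
        return loaded


# Usage: one order event is pushed, enriched with customer master data, and loaded.
etl = EtlLayer(master_data={7: {"customer_name": "Acme", "region": "EU"}})
etl.on_change(ChangeEvent("orders", 1, {"customer_id": 7, "amount": 120.0}))
etl.drain()
```

In this sketch the source pushes events rather than the ETL layer polling for them, which is the property that distinguishes the near-real-time design from nightly batch loading. Keeping master data in a separate lookup mirrors the abstract's distinction between different kinds of data.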