To meet the growing business demand for up-to-date information, data integration approaches are moving towards real-time updates. In real-time data integration, updates occurring on the source systems need to be reflected in the data warehouse immediately. One important element of real-time data integration is the join of a continuous incoming data stream with disk-based master data. In this context, a stream-based algorithm called X-HYBRIDJOIN (Extended Hybrid Join) was proposed earlier, with favorable asymptotic runtime behavior. However, its absolute performance was not as good as hoped. In this paper we present results showing that, with proper tuning, the resulting Tuned X-HYBRIDJOIN performs significantly better than the previous X-HYBRIDJOIN and better than other applicable join operators found in the literature. We present the tuning approach, which is based on measurement techniques and a revised cost model. To evaluate the algorithm's performance, we conduct an experimental study, which shows that Tuned X-HYBRIDJOIN exhibits the desired performance characteristics.