View maintenance in a warehousing environment
SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
The art of computer programming, volume 3: (2nd ed.) sorting and searching
The art of computer programming, volume 3: (2nd ed.) sorting and searching
An adaptive query execution system for data integration
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Maintenance of materialized views: problems, techniques, and applications
Materialized views
Efficient resumption of interrupted warehouse loads
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Database Management Systems
Performance Issues in Incremental Warehouse Maintenance
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Efficient Snapshot Differential Algorithms for Data Warehousing
VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases
Hash-Merge Join: A Non-blocking Join Algorithm for Producing Fast and Early Join Results
ICDE '04 Proceedings of the 20th International Conference on Data Engineering
ETL queues for active data warehousing
Proceedings of the 2nd international workshop on Information quality in information systems
Early hash join: a configurable algorithm for the efficient and early production of join results
VLDB '05 Proceedings of the 31st international conference on Very large data bases
The Long Tail: Why the Future of Business Is Selling Less of More
The Long Tail: Why the Future of Business Is Selling Less of More
Processing sliding window multi-joins in continuous queries over data streams
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Partition-based workload scheduling in living data warehouse environments
Proceedings of the ACM tenth international workshop on Data warehousing and OLAP
Query processing of multi-way stream window joins
The VLDB Journal — The International Journal on Very Large Data Bases
Meshing Streaming Updates with Persistent Data in an Active Data Warehouse
IEEE Transactions on Knowledge and Data Engineering
An Event-Based Near Real-Time Data Integration Architecture
EDOCW '08 Proceedings of the 2008 12th Enterprise Distributed Object Computing Conference Workshops
IPDPS '09 Proceedings of the 2009 IEEE International Symposium on Parallel&Distributed Processing
Resource optimization for processing of stream data in data warehouse environment
Proceedings of the International Conference on Advances in Computing, Communications and Informatics
Towards benchmarking stream data warehouses
Proceedings of the fifteenth international workshop on Data warehousing and OLAP
Optimised X-HYBRIDJOIN for near-real-time data warehousing
ADC '12 Proceedings of the Twenty-Third Australasian Database Conference - Volume 124
Hi-index | 0.00 |
An important component of near-real-time data warehouses is the near-real-time integration layer. One important element in near-real-time data integration is the join of a continuous input data stream with a disk-based relation. For high-throughput streams, stream-based algorithms, such as Mesh Join MESHJOIN, can be used. However, in MESHJOIN the performance of the algorithm is inversely proportional to the size of disk-based relation. The Index Nested Loop Join INLJ can be set up so that it processes stream input, and can deal with intermittences in the update stream but it has low throughput. This paper introduces a robust stream-based join algorithm called Hybrid Join HYBRIDJOIN, which combines the two approaches. A theoretical result shows that HYBRIDJOIN is asymptotically as fast as the fastest of both algorithms. The authors present performance measurements of the implementation. In experiments using synthetic data based on a Zipfian distribution, HYBRIDJOIN performs significantly better for typical parameters of the Zipfian distribution, and in general performs in accordance with the theoretical model while the other two algorithms are unacceptably slow under different settings.