Using prefix-trees for efficiently computing set joins

Authors:
Ravindranath Jampani;Vikram Pudi
Affiliations:
Center for Data Engineering, International Institute of Information Technology, Hyderabad, India;Center for Data Engineering, International Institute of Information Technology, Hyderabad, India
Venue:
DASFAA'05 Proceedings of the 10th international conference on Database Systems for Advanced Applications
Year:
2005

Abstract

Joins on set-valued attributes (set joins) have numerous database applications. In this paper we propose PRETTI (PREfix Tree based seT joIn), a suite of set join algorithms for containment, overlap, and equality join predicates. Our algorithms use prefix trees and inverted indices, which are constructed on the fly if they have not been precomputed. This makes the algorithms usable on relations without indices and on intermediate results in join queries over more than two relations. The algorithms also output results continuously during execution rather than only at the end. Experiments on real-life datasets show that the total execution time of our algorithms is significantly lower than that of previous approaches, even when the indices they require are not precomputed.
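
To make the idea concrete, here is a minimal Python sketch of a containment join that combines a prefix tree with an inverted index. It illustrates the general technique the abstract describes and is not the paper's own pseudocode. The names `TrieNode`, `build_prefix_tree`, `build_inverted_index`, and `containment_join` are hypothetical, and relations are assumed to be dictionaries that map a tuple id to a set of comparable items.

```python
from collections import defaultdict


class TrieNode:
    """One node of the prefix tree built over the sorted sets of relation R."""

    def __init__(self):
        self.children = {}   # item -> TrieNode
        self.tuple_ids = []  # ids of R-tuples whose sorted set ends at this node


def build_prefix_tree(relation):
    """Insert every set of `relation` into a prefix tree, items in sorted order."""
    root = TrieNode()
    for tid, items in relation.items():
        node = root
        for item in sorted(items):
            node = node.children.setdefault(item, TrieNode())
        node.tuple_ids.append(tid)
    return root


def build_inverted_index(relation):
    """Map each item to the set of tuple ids whose set contains that item."""
    index = defaultdict(set)
    for tid, items in relation.items():
        for item in items:
            index[item].add(tid)
    return index


def containment_join(r, s):
    """Yield (r_id, s_id) for every pair where r[r_id] is a subset of s[s_id].

    Assumption: both structures are built on the fly here; precomputed
    ones could be passed in instead.
    """
    root = build_prefix_tree(r)
    index = build_inverted_index(s)

    # Each stack entry pairs a tree node with the S-tuples that contain
    # every item on the path from the root to that node.
    stack = [(root, set(s))]
    while stack:
        node, candidates = stack.pop()
        # R-tuples ending here are contained in all remaining candidates,
        # so their result pairs can be emitted immediately.
        for r_id in node.tuple_ids:
            for s_id in sorted(candidates):
                yield (r_id, s_id)
        for item, child in node.children.items():
            narrowed = candidates & index.get(item, set())
            if narrowed:  # prune subtrees that no S-tuple can satisfy
                stack.append((child, narrowed))
```

Because `containment_join` is a generator, pairs become available while the tree is still being traversed, which mirrors the continuous-output property mentioned in the abstract. R-tuples that share a prefix also share the candidate-list intersections computed for that prefix, which is where the prefix tree saves work in this sketch.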