Object-oriented and object-relational DBMSs support set-valued attributes, which are a natural and concise way to model complex information. However, there has been limited research to date on evaluating query operators that apply to sets. In this paper we study the join of two relations on their set-valued attributes. We consider several join types, namely the set containment, set equality, and set overlap joins. We show that the inverted file, a powerful index for selection queries, can also facilitate the efficient evaluation of most join predicates. We propose join algorithms that utilize inverted files and compare them with signature-based methods for several set-comparison predicates.
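To make the idea of joining through an inverted file concrete, the following is a minimal in-memory sketch and not the algorithms proposed in the paper. It assumes each relation is a Python dict mapping a tuple id to its set-valued attribute; the function names `build_inverted_file`, `containment_join`, and `overlap_join` are hypothetical and chosen only for illustration. A containment join intersects the inverted lists of the items in each probing set, while an overlap join takes their union.

```python
from collections import defaultdict


def build_inverted_file(relation):
    """Map each item to the set of tuple ids whose set attribute contains it."""
    inverted = defaultdict(set)
    for tid, items in relation.items():
        for item in items:
            inverted[item].add(tid)
    return inverted


def containment_join(r, s):
    """Return pairs (r_id, s_id) such that r[r_id] is a subset of s[s_id]."""
    inverted = build_inverted_file(s)
    result = []
    for r_id, items in r.items():
        if not items:
            # The empty set is contained in every set.
            candidates = set(s)
        else:
            # Intersect inverted lists, shortest first, so intermediates stay small.
            lists = sorted((inverted.get(item, set()) for item in items), key=len)
            candidates = set(lists[0])
            for lst in lists[1:]:
                candidates &= lst
                if not candidates:
                    break
        result.extend((r_id, s_id) for s_id in sorted(candidates))
    return result


def overlap_join(r, s):
    """Return pairs (r_id, s_id) whose sets share at least one item."""
    inverted = build_inverted_file(s)
    result = []
    for r_id, items in r.items():
        candidates = set()
        for item in items:
            # Any tuple appearing in some inverted list overlaps with r[r_id].
            candidates |= inverted.get(item, set())
        result.extend((r_id, s_id) for s_id in sorted(candidates))
    return result
```

Under the same assumptions, a set equality join could reuse `containment_join` and keep only the pairs whose two sets have equal cardinality. A disk-based implementation would differ substantially, since this sketch ignores list compression, buffering, and I/O cost.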