Efficient processing of containment queries on nested sets

Authors:
Ahmed Ibrahim;George H. L. Fletcher
Affiliations:
Eindhoven University of Technology, The Netherlands;Eindhoven University of Technology, The Netherlands
Venue:
Proceedings of the 16th International Conference on Extending Database Technology
Year:
2013

Abstract

We study the problem of computing containment queries on nested sets, i.e., sets whose elements can be both atomic and set-valued objects. Containment is a fundamental query pattern with many basic applications. Our study of nested set containment is motivated by the ubiquity of nested data in practice, e.g., in XML and JSON data management, in business and scientific workflow management, and in web analytics. Furthermore, to our knowledge, there are no known efficient solutions for computing containment queries on massive collections of nested sets. Our specific contributions in this paper are: (1) we introduce two novel algorithms for efficient evaluation of containment queries on massive collections of nested sets; (2) we study caching and filtering mechanisms to accelerate query processing in the algorithms; (3) we develop extensions of the algorithms to (a) compute several related query types and (b) accommodate natural variations of the semantics of containment; and (4) we present analytic and empirical analyses which demonstrate that both algorithms are efficient and scalable.
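To make the query pattern concrete, the following is a minimal Python sketch of a naive containment check over a small collection of nested sets. It is not one of the two algorithms the abstract refers to, which are not described here. It assumes one possible semantics: a query q is contained in a set s if every atomic element of q occurs in s and every set-valued element of q is, recursively, contained in some set-valued element of s. The names `contains` and `query` and the use of `frozenset` to model nested sets are illustrative choices, not taken from the paper.

```python
def contains(s: frozenset, q: frozenset) -> bool:
    """Return True if q is contained in s under the assumed semantics.

    Assumption (not from the paper): atoms of q must appear in s, and each
    set-valued element of q must be contained in some set-valued element of s.
    """
    for elem in q:
        if isinstance(elem, frozenset):
            # A nested set in q must embed into at least one nested set in s.
            if not any(
                isinstance(child, frozenset) and contains(child, elem)
                for child in s
            ):
                return False
        elif elem not in s:
            # An atomic element of q must occur directly in s.
            return False
    return True


def query(collection, q):
    """Naive linear scan: indices of all sets in the collection that contain q."""
    return [i for i, s in enumerate(collection) if contains(s, q)]


# Hypothetical toy data for illustration only.
collection = [
    frozenset({"a", frozenset({"b", "c"})}),
    frozenset({"a", "b"}),
    frozenset({"a", frozenset({"b"}), frozenset({"c", frozenset({"d"})})}),
]
q = frozenset({"a", frozenset({"b"})})
print(query(collection, q))
```

A scan like this inspects every set in the collection for every query, which is the cost that indexing, caching, and filtering techniques aim to avoid on massive collections.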