SQL-based data mining algorithms are rarely used in practice today. Most performance experiments have shown that SQL-based approaches are inferior to main-memory algorithms. Nevertheless, database vendors try to integrate analysis functionality into their query execution and optimization components to some extent, in order to narrow the gap between data and processing. Such database support is particularly important when data mining applications need to analyze very large datasets or when they need to access current data rather than a possibly outdated copy of it.

We investigate SQL-based approaches to the problem of finding frequent itemsets in a transaction table, including an algorithm that we recently proposed, called Quiver, which employs universal and existential quantification. This approach uses a table schema for itemsets that is similar to the commonly used vertical layout for transactions: each item of an itemset is stored in a separate row. We argue that expressing the frequent itemset discovery problem with quantifiers offers interesting opportunities to process such queries using set containment join or set containment division operators, which are not yet available in commercial database systems. Initial performance experiments reveal that commercial DBMSs cannot process Quiver efficiently. However, our experiments with query execution plans that use operators realizing set containment tests suggest that efficient processing of Quiver is possible.
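To make the vertical layout and the role of quantification concrete, the following is a minimal sketch of support counting, not the Quiver query itself. It assumes hypothetical table names (`transactions`, `itemsets`), hypothetical column names, and SQLite as the engine. Because SQL has no direct "for all" construct, the universal quantification "every item of the itemset occurs in the transaction" is written as a double `NOT EXISTS`, which amounts to a set containment test between each itemset and each transaction.

```python
import sqlite3

# Hypothetical example data: transaction id -> items, itemset id -> items.
TRANSACTIONS = {1: {"a", "b", "c"}, 2: {"a", "c"}, 3: {"b", "c"}, 4: {"a", "b"}}
ITEMSETS = {10: {"a", "b"}, 20: {"a", "c"}, 30: {"c"}, 40: {"a", "d"}}

# For each (itemset, transaction) pair, keep the pair only if there is no
# item of the itemset that is missing from the transaction.
SUPPORT_QUERY = """
    SELECT s.itemset, COUNT(*) AS support
    FROM (SELECT DISTINCT itemset FROM itemsets) AS s,
         (SELECT DISTINCT tid FROM transactions) AS t
    WHERE NOT EXISTS (
        SELECT 1 FROM itemsets AS i
        WHERE i.itemset = s.itemset
          AND NOT EXISTS (
              SELECT 1 FROM transactions AS x
              WHERE x.tid = t.tid AND x.item = i.item))
    GROUP BY s.itemset
"""


def itemset_supports(transactions, itemsets):
    """Return {itemset id: support}; itemsets with support 0 are omitted."""
    con = sqlite3.connect(":memory:")
    # Vertical layout: one row per (transaction, item) and per (itemset, item).
    con.execute("CREATE TABLE transactions (tid INTEGER, item TEXT)")
    con.execute("CREATE TABLE itemsets (itemset INTEGER, item TEXT)")
    con.executemany(
        "INSERT INTO transactions VALUES (?, ?)",
        [(tid, item) for tid, items in transactions.items() for item in items],
    )
    con.executemany(
        "INSERT INTO itemsets VALUES (?, ?)",
        [(sid, item) for sid, items in itemsets.items() for item in items],
    )
    supports = dict(con.execute(SUPPORT_QUERY).fetchall())
    con.close()
    return supports


def frequent_itemsets(transactions, itemsets, min_support):
    """Return the ids of itemsets contained in at least min_support transactions."""
    supports = itemset_supports(transactions, itemsets)
    return {sid for sid, support in supports.items() if support >= min_support}
```

With the example data, `itemset_supports(TRANSACTIONS, ITEMSETS)` yields `{10: 2, 20: 2, 30: 3}`; itemset 40 is absent because no transaction contains item `d`. The nested-subquery form is evaluated pair by pair in this sketch, which is the kind of plan a dedicated set containment join or division operator would be meant to replace.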