Efficiently mining long patterns from databases
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Depth first generation of long patterns
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Levelwise Search and Borders of Theories in Knowledge Discovery
Data Mining and Knowledge Discovery
New Results on Monotone Dualization and Generating Hypergraph Transversals
SIAM Journal on Computing
Efficient Discovery of Functional Dependencies and Armstrong Relations
EDBT '00 Proceedings of the 7th International Conference on Extending Database Technology: Advances in Database Technology
Pincer Search: A New Algorithm for Discovering the Maximum Frequent Set
EDBT '98 Proceedings of the 6th International Conference on Extending Database Technology: Advances in Database Technology
MAFIA: A Maximal Frequent Itemset Algorithm for Transactional Databases
Proceedings of the 17th International Conference on Data Engineering
FUN: An Efficient Algorithm for Mining Functional and Embedded Dependencies
ICDT '01 Proceedings of the 8th International Conference on Database Theory
Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
DaWaK '01 Proceedings of the Third International Conference on Data Warehousing and Knowledge Discovery
Discovering all most specific sentences
ACM Transactions on Database Systems (TODS)
gSpan: Graph-Based Substructure Pattern Mining
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach
Data Mining and Knowledge Discovery
Fast Algorithms for Frequent Itemset Mining Using FP-Trees
IEEE Transactions on Knowledge and Data Engineering
GenMax: An Efficient Algorithm for Mining Maximal Frequent Itemsets
Data Mining and Knowledge Discovery
A Thorough Experimental Study of Datasets for Frequent Itemsets
ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
Parallel Leap: Large-Scale Maximal Pattern Mining in a Distributed Environment
ICPADS '06 Proceedings of the 12th International Conference on Parallel and Distributed Systems - Volume 1
Cache-conscious frequent pattern mining on modern and emerging processors
The VLDB Journal — The International Journal on Very Large Data Bases
Optimization of frequent itemset mining on multiple-core processor
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Efficient mining of maximal frequent itemsets from databases on a cluster of workstations
Knowledge and Information Systems
A view selection algorithm with performance guarantee
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Emerging Cubes: Borders, size estimations and lossless reductions
Information Systems
Standing Out in a Crowd: Selecting Attributes for Maximum Visibility
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Knowledge and Information Systems
Constructing and exploring composite items
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Discovering Conditional Functional Dependencies
IEEE Transactions on Knowledge and Data Engineering
The border concept was introduced by Mannila and Toivonen in their seminal paper [20]. It has many applications, e.g., maximal frequent itemsets, minimal functional dependencies, emerging patterns between consecutive database instances, and materialized view selection. For large transactional and relational databases defined on n items or attributes, the running time of any border computation is dominated by the time T (for standard sequential algorithms) required to test the interestingness, in general the frequency, of sets of candidates. In this paper we propose a general parallel algorithm for computing borders regardless of the application. We prove the efficiency of our algorithm by showing that (i) it generates exactly the same number of candidates as the standard sequential algorithm, and (ii) if the interestingness test time of a candidate is bounded by Δ, then for a multiprocessor shared-memory machine with p cores, the total interestingness time Tp