Computing the frequent subsets of large multi-attribute data sets is a key component of data mining algorithms for local pattern detection. The task is both computation- and data-intensive. Standard parallel algorithms require multiple passes over the data, so the cost of data access can easily outweigh any performance gained by parallelizing the computation. We address two opportunities for performance improvement: a parallel approximate algorithm that requires only a single pass over the data, and a probabilistic technique that avoids generating most of the lattice of subsets implied by each object's data. The computation required is only slightly greater than that of levelwise algorithms, but the amount of data access is much smaller.
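The abstract does not spell out the algorithm, so the following is only a rough illustration of the general idea of single-pass, sampled subset counting, not the authors' method. It assumes a simple scheme: for each object, draw a few random subsets of each size instead of enumerating the whole lattice, and weight each draw by the number of subsets of that size so that the accumulated count is an unbiased estimate of the true support. The function name `estimate_supports` and the parameters `max_size`, `draws`, and `seed` are hypothetical.

```python
import math
import random
from collections import Counter


def estimate_supports(transactions, max_size=2, draws=1, seed=0):
    """Estimate itemset supports in one pass by sampling subsets.

    Assumed scheme (illustrative only): for each transaction with n items
    and each size k, draw `draws` uniformly random k-subsets and add
    C(n, k) / draws to that subset's counter. A given k-subset is drawn
    with probability 1 / C(n, k), so each counter is an unbiased
    estimate of the subset's true support.
    """
    rng = random.Random(seed)
    estimates = Counter()
    for transaction in transactions:  # the data is read exactly once
        items = sorted(set(transaction))
        n = len(items)
        for k in range(1, min(max_size, n) + 1):
            weight = math.comb(n, k) / draws
            for _ in range(draws):
                # Sample one k-subset instead of enumerating all C(n, k).
                subset = tuple(sorted(rng.sample(items, k)))
                estimates[subset] += weight
    return estimates


if __name__ == "__main__":
    baskets = [
        ("bread", "milk"),
        ("bread", "butter", "milk"),
        ("beer", "bread"),
        ("butter", "milk"),
    ]
    for itemset, support in estimate_supports(baskets, draws=2).most_common(5):
        print(itemset, round(support, 2))
```

Because the transactions are processed independently, the per-partition counters could be computed in parallel and summed afterwards. The estimates are noisy for small inputs; raising `draws` reduces the variance at the cost of more computation per object.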