The discovery of association rules is a prototypical problem in data mining. Current algorithms for mining association rules make repeated passes over the database to determine the frequently occurring itemsets (sets of items). For large databases, the I/O overhead of scanning the database can be extremely high. The authors show that random sampling of transactions in the database is an effective method for finding association rules. Sampling can speed up the mining process by more than an order of magnitude by reducing I/O costs and drastically shrinking the number of transactions to be considered. The sampled database may also be small enough to reside in main memory. Furthermore, the authors show that a sample can accurately represent the data patterns in the database with high confidence. They experimentally evaluate the effectiveness of sampling on different databases and study the relationship between the performance, accuracy, and confidence of the chosen sample.
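
To make the idea concrete, the following is a minimal sketch of mining frequent itemsets from a random sample of transactions rather than from the full database. It is not the authors' implementation. The function names, the toy database, the brute-force counting, and the use of the Hoeffding inequality to choose a sample size are all assumptions made for illustration; the abstract does not say which bound or mining algorithm the authors used. Under the Hoeffding inequality, drawing n >= ln(2/delta) / (2 * epsilon^2) transactions uniformly with replacement keeps the sampled support of any single itemset within epsilon of its true support with probability at least 1 - delta.

```python
import math
import random
from collections import Counter
from itertools import combinations


def hoeffding_sample_size(epsilon, delta):
    """Sample size so that ONE itemset's sampled support is within
    epsilon of its true support with probability >= 1 - delta.
    Assumes uniform sampling with replacement (Hoeffding bound)."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def sample_transactions(transactions, n, seed=None):
    """Draw n transactions uniformly at random, with replacement."""
    rng = random.Random(seed)
    return [rng.choice(transactions) for _ in range(n)]


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: relative support} for itemsets of at most
    max_size items whose support is at least min_support.
    Brute-force counting, suitable only for small illustrations."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    total = len(transactions)
    return {s: c / total for s, c in counts.items() if c / total >= min_support}


if __name__ == "__main__":
    # Hypothetical toy database: "a" occurs in 60% of transactions,
    # "b" in 30% (always together with "a"), "c" in 40%.
    db = [("a", "b")] * 3000 + [("a",)] * 3000 + [("c",)] * 4000
    n = hoeffding_sample_size(epsilon=0.05, delta=0.05)
    sample = sample_transactions(db, n, seed=0)
    print("sample size:", n)
    print("full database:", frequent_itemsets(db, min_support=0.25))
    print("sample:", frequent_itemsets(sample, min_support=0.25))
```

The bound covers one itemset at a time; a guarantee over all itemsets simultaneously would need a larger sample (for example via a union bound). The sample size depends only on epsilon and delta, not on the database size, which is why a sample of a very large database can fit in main memory.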