Approximate mining of frequent patterns on streams

Authors:
Claudio Silvestri;Salvatore Orlando
Affiliations:
Dipartimento di Informatica, Università Ca' Foscari, Via Torino 155, Venezia, Italy. E-mail: {silvestri,orlando}@dsi.unive.it;Dipartimento di Informatica, Università Ca' Foscari, Via Torino 155, Venezia, Italy. E-mail: {silvestri,orlando}@dsi.unive.it
Venue:
Intelligent Data Analysis - Knowledge Discovery from Data Streams
Year:
2007

Abstract

Many critical applications, such as intrusion detection or stock market analysis, require near-immediate results from a continuous, unbounded stream of data. In most cases, computing an exact solution is incompatible with limited resources and real-time constraints, while an approximation of the exact result is sufficient for most purposes. This paper introduces ${AP}_{stream}$, a new algorithm for approximate mining of frequent itemsets from streams of transactions using a limited amount of memory. The algorithm combines the computation of frequent itemsets in recent data with an effective method for inferring the global support of previously infrequent itemsets. Along with the interpolated support, it returns both upper and lower bounds on the support of each pattern found. An extensive experimental evaluation shows that ${AP}_{stream}$ yields a good approximation of the exact global result, in terms of both the set of patterns found and their supports.
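The general idea of block-wise mining with support bounds can be illustrated with a minimal Python sketch. This is not the ${AP}_{stream}$ algorithm itself: the class name `BlockwiseApproxMiner`, the brute-force per-block counting, and the use of the midpoint of the bounds as the "interpolated" support are all assumptions made for illustration, and the sketch never prunes its table, so it does not enforce a memory limit. It only shows how lower and upper bounds can be maintained when an itemset's count is unknown in blocks where it was infrequent.

```python
from collections import Counter
from itertools import combinations
from math import ceil


def frequent_itemsets(block, min_count, max_len=3):
    """Brute-force count of itemsets (up to max_len items) in one block."""
    counts = Counter()
    for transaction in block:
        items = sorted(set(transaction))
        for k in range(1, max_len + 1):
            for combo in combinations(items, k):
                counts[combo] += 1
    return {s: c for s, c in counts.items() if c >= min_count}


class BlockwiseApproxMiner:
    """Hypothetical sketch: per-block mining plus global support bounds."""

    def __init__(self, min_support, max_len=3):
        self.min_support = min_support
        self.max_len = max_len
        self.n = 0          # transactions seen so far
        self.past_cap = 0   # max count of an itemset infrequent in every past block
        self.bounds = {}    # itemset -> [lower, upper] absolute counts

    def add_block(self, block):
        min_count = ceil(self.min_support * len(block))
        local = frequent_itemsets(block, min_count, self.max_len)
        # An itemset that is infrequent in this block occurs at most this often.
        miss_cap = max(min_count - 1, 0)
        # Tracked itemsets not frequent here: exact count unknown, in [0, miss_cap].
        for itemset, b in self.bounds.items():
            if itemset not in local:
                b[1] += miss_cap
        for itemset, count in local.items():
            if itemset in self.bounds:
                self.bounds[itemset][0] += count
                self.bounds[itemset][1] += count
            else:
                # Never frequent before: its past count lies in [0, past_cap].
                self.bounds[itemset] = [count, count + self.past_cap]
        self.past_cap += miss_cap
        self.n += len(block)

    def result(self):
        """Return {itemset: (lower, estimate, upper)} as relative supports."""
        out = {}
        for itemset, (lo, hi) in self.bounds.items():
            est = (lo + hi) / 2  # midpoint: an assumed stand-in for interpolation
            if est >= self.min_support * self.n:
                out[itemset] = (lo / self.n, est / self.n, hi / self.n)
        return out


miner = BlockwiseApproxMiner(min_support=0.5)
miner.add_block([("a", "b"), ("a", "b"), ("a", "c"), ("b",)])
miner.add_block([("a", "c"), ("a", "c"), ("c",), ("b",)])
print(miner.result())
```

In this toy run, the itemset `("a", "c")` is infrequent in the first block and frequent in the second, so its global count is only known to lie between 2 and 3; the true count (3) falls inside that interval.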