A data stream is a potentially uninterrupted flow of data. Because only part of the stream can be stored, mining it means coping with uncertainty. In this paper, we evaluate a statistical technique that deliberately biases the estimated support of patterns so as to maximize either precision or recall, as chosen by the user, while limiting the degradation of the other criterion. Theoretical results show that the technique is close to optimal from a statistical standpoint. Experiments suggest that it remains robust even under significant distribution drift.
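To make the idea of a biased support estimate concrete, here is a minimal Python sketch. It is not the paper's actual estimator, which the abstract does not spell out. It assumes a one-sided Hoeffding bound as the source of the bias, and the names `hoeffding_margin`, `biased_threshold`, `select_patterns`, `theta`, `delta`, and `favor` are all hypothetical. Raising the cutoff above the user's support threshold favors precision (fewer false positives), while lowering it favors recall (fewer false negatives).

```python
import math


def hoeffding_margin(sample_size: int, delta: float) -> float:
    """One-sided Hoeffding margin: with probability at least 1 - delta,
    an observed frequency over `sample_size` items exceeds the true
    frequency by less than this amount (assumption: i.i.d. sampling)."""
    return math.sqrt(math.log(1.0 / delta) / (2.0 * sample_size))


def biased_threshold(theta: float, sample_size: int, delta: float, favor: str) -> float:
    """Shift the support threshold `theta` in the direction that protects
    the criterion chosen by the user (hypothetical interface)."""
    eps = hoeffding_margin(sample_size, delta)
    if favor == "precision":
        # Stricter cutoff: a pattern is kept only if it is very likely frequent.
        return min(1.0, theta + eps)
    if favor == "recall":
        # Looser cutoff: a pattern is dropped only if it is very likely infrequent.
        return max(0.0, theta - eps)
    raise ValueError("favor must be 'precision' or 'recall'")


def select_patterns(observed_supports: dict, theta: float, sample_size: int,
                    delta: float, favor: str) -> set:
    """Return the patterns whose observed support meets the biased cutoff."""
    cutoff = biased_threshold(theta, sample_size, delta, favor)
    return {p for p, s in observed_supports.items() if s >= cutoff}


# Illustrative, made-up supports observed on a stored sample of 1000 items.
supports = {"a": 0.15, "b": 0.11, "c": 0.07, "d": 0.05}
high_precision = select_patterns(supports, 0.10, 1000, 0.05, "precision")
high_recall = select_patterns(supports, 0.10, 1000, 0.05, "recall")
```

With these illustrative numbers the margin is about 0.039, so the precision-oriented run keeps only patterns observed at roughly 0.139 or above, while the recall-oriented run keeps everything observed at roughly 0.061 or above. The margin shrinks as the stored sample grows, which limits how much the criterion not chosen by the user degrades.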