Recently, the knowledge extraction community has taken a closer look at new models in which data arrive as a fast, continuous flow, i.e. data streams. Because only part of the stream can be stored, mining data streams for sequential patterns and updating previously found frequent patterns must cope with uncertainty. In this paper, we introduce a new statistical approach that biases the initial support for sequential patterns. This approach has the advantage of maximizing either the precision or the recall, as chosen by the user, while limiting the degradation of the other criterion. Moreover, these statistical supports help build statistical borders, which are the relevant sets of frequent patterns to use in an incremental mining process. From the statistical standpoint, theoretical results show that the technique is not far from the optimum. Experiments performed on sequential patterns demonstrate the value of this approach and the potential of such techniques.
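The idea of biasing the support threshold can be illustrated with a minimal sketch. It is not the paper's own bound: it assumes a standard one-sided Hoeffding deviation, and every name in it (`hoeffding_epsilon`, `biased_support`, `select_patterns`, the `delta` risk parameter) is hypothetical. Raising the threshold by the deviation favors precision, since patterns kept from the sample are unlikely to be infrequent in the full stream. Lowering it favors recall, since truly frequent patterns are unlikely to be missed.

```python
import math


def hoeffding_epsilon(sample_size: int, delta: float) -> float:
    """One-sided Hoeffding deviation for a frequency estimated on a sample.

    Assumption (not from the paper): the observed support exceeds the true
    support by more than this value with probability at most delta.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return math.sqrt(math.log(1.0 / delta) / (2.0 * sample_size))


def biased_support(min_support: float, sample_size: int,
                   delta: float, favor: str) -> float:
    """Shift the user threshold up (precision) or down (recall)."""
    eps = hoeffding_epsilon(sample_size, delta)
    if favor == "precision":
        return min(1.0, min_support + eps)
    if favor == "recall":
        return max(0.0, min_support - eps)
    raise ValueError("favor must be 'precision' or 'recall'")


def select_patterns(observed_supports: dict, min_support: float,
                    sample_size: int, delta: float, favor: str) -> set:
    """Keep the patterns whose sampled support reaches the biased threshold."""
    threshold = biased_support(min_support, sample_size, delta, favor)
    return {p for p, s in observed_supports.items() if s >= threshold}


# Hypothetical sampled supports for four sequential patterns.
observed = {"a b": 0.12, "a c": 0.105, "b c": 0.09, "c d": 0.05}
strict = select_patterns(observed, 0.10, 10_000, 0.05, "precision")
loose = select_patterns(observed, 0.10, 10_000, 0.05, "recall")
```

With 10,000 sampled sequences and `delta = 0.05`, the deviation is about 0.012, so the precision-oriented threshold is about 0.112 and the recall-oriented one about 0.088. The patterns that pass the loose threshold but not the strict one are the uncertain ones near the frontier between frequent and infrequent, which is roughly the role the abstract gives to statistical borders.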