A pattern is considered useful if it helps a person achieve their goal. Mining data streams for useful patterns is important in many applications. However, data streams can change their behavior over time, and when a significant change occurs, the mining result is badly harmed unless the change is properly handled. Many past studies have focused mainly on adapting to changes in data streams. We contend that adapting to changes is not enough. The ability to detect and characterize change is also essential in many applications, for example intrusion detection, network traffic analysis, and data streams from intensive care units. Detecting changes is nontrivial. In this paper, an online algorithm for change detection in utility mining is proposed. To provide a mechanism for describing the detected change quantitatively, we adopt statistical tests. We believe there is an opportunity for an immensely rewarding synergy between data mining and statistics. Different statistical significance tests are evaluated, and our study shows that the Chi-square test is the most suitable for enumerated or count data (as is the case for high utility itemsets). We demonstrate the effectiveness of the proposed method by testing it on IBM QUEST market-basket data.
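To make the idea of a Chi-square test on count data concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes that change is assessed by comparing how often one itemset qualifies (for example, as a high utility itemset) in two stream windows, using a Pearson Chi-square test on a 2x2 table with one degree of freedom. The function names `chi_square_2x2` and `detect_change`, the window sizes, and the significance level `alpha` are all hypothetical choices made for illustration.

```python
import math


def chi_square_2x2(count_a, total_a, count_b, total_b):
    """Pearson chi-square test (df = 1) on a 2x2 contingency table.

    Rows are the two windows (A = older, B = newer); columns are
    "itemset counted" versus "itemset not counted". This layout is an
    assumption for illustration, not taken from the paper.
    """
    observed = [
        [count_a, total_a - count_a],
        [count_b, total_b - count_b],
    ]
    row_totals = [total_a, total_b]
    col_totals = [
        count_a + count_b,
        (total_a - count_a) + (total_b - count_b),
    ]
    grand_total = total_a + total_b

    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected > 0:  # skip empty cells to avoid division by zero
                stat += (observed[i][j] - expected) ** 2 / expected

    # For one degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def detect_change(count_a, total_a, count_b, total_b, alpha=0.05):
    """Report a change when the test rejects equal proportions at level alpha."""
    stat, p_value = chi_square_2x2(count_a, total_a, count_b, total_b)
    return p_value < alpha, stat, p_value


if __name__ == "__main__":
    # Hypothetical counts: the itemset qualifies in 50 of 1000 transactions
    # in the older window and in 120 of 1000 in the newer window.
    changed, stat, p_value = detect_change(50, 1000, 120, 1000)
    print(changed, round(stat, 2), p_value)
```

The test statistic doubles as a quantitative description of the change: larger values indicate a stronger departure from the assumption that both windows share the same underlying proportion. In a real stream setting many itemsets would be tested at once, so some correction for multiple comparisons would likely be needed; that is outside this sketch.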