We present a generic framework for evaluating patterns mined from transactional web data streams whose underlying distribution changes over time. Because the data evolve, it is difficult to determine whether the stream contains structure and whether that structure is being learned. This challenge arises in applications such as mining online store transactions, summarizing dynamic document collections, and profiling web traffic. We propose to evaluate this hard instance of unsupervised learning by continuously assessing the predictive power of the learned patterns, with specific examples that borrow concepts from supervised learning. We report results from experiments on synthetic data, the 20 Newsgroups dataset, web clickstream data, and a custom collection of RSS news feeds.
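One way to picture a continuous assessment of predictive power is a test-then-train loop over the stream. The sketch below is an illustration under stated assumptions, not the framework's actual procedure: the function name `prequential_hit_rate`, the pair co-occurrence model, the sliding `window`, and the choice to hide the last item of each transaction in sorted order are all hypothetical. Each transaction is first used to test the current patterns (can the hidden item be predicted from the visible ones?) and only afterwards used to update them.

```python
from collections import Counter, deque
from itertools import combinations


def prequential_hit_rate(stream, window=100, top_n=1):
    """Test-then-train evaluation on a stream of transactions (item sets).

    Hypothetical sketch: hide one item per transaction, predict it from
    the remaining items using pair co-occurrence counts kept over a
    sliding window, score the prediction, then learn from the transaction.
    Returns the running hit rate after each scored transaction.
    """
    recent = deque()         # transactions currently inside the window
    pair_counts = Counter()  # co-occurrence counts for ordered pairs (a, b)
    hits = 0
    scored = 0
    history = []

    for transaction in stream:
        items = sorted(set(transaction))
        if len(items) >= 2:
            # Assumption: hide the last item in sorted order.
            hidden, visible = items[-1], items[:-1]
            # Test: rank candidates by co-occurrence with the visible items.
            votes = Counter()
            for (a, b), count in pair_counts.items():
                if count > 0 and a in visible and b not in visible:
                    votes[b] += count
            predicted = [item for item, _ in votes.most_common(top_n)]
            hits += hidden in predicted
            scored += 1
            history.append(hits / scored)

        # Train: add the transaction, then expire the oldest if needed.
        recent.append(items)
        for a, b in combinations(items, 2):
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
        if len(recent) > window:
            expired = recent.popleft()
            for a, b in combinations(expired, 2):
                pair_counts[(a, b)] -= 1
                pair_counts[(b, a)] -= 1
    return history


if __name__ == "__main__":
    # Toy stream with an abrupt change in distribution halfway through.
    stream = [{"a", "b"}] * 50 + [{"a", "c"}] * 50
    rates = prequential_hit_rate(stream, window=10)
    print(f"before change: {rates[49]:.2f}, final: {rates[-1]:.2f}")
```

On the toy stream, the running hit rate dips just after the change and recovers once the window is dominated by the new transactions. A sustained drop in such a curve would suggest that the learned patterns no longer reflect the stream.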