The problem of discovering episode rules from static databases has been studied for years because of its wide applications in prediction. In this paper, we make the first attempt to study a special kind of episode rule, the serial episode rule with a time lag, in an environment of multiple data streams. Such rules apply to a range of settings, for example traffic monitoring over multiple streams of car-passing events on highways. Mining serial episode rules over data streams is challenging because of the high data arrival rates and the unbounded length of the streams. We propose two methods that weigh space utilization and precision differently; both use a prefix tree to summarize the data streams and then traverse the prefix tree to generate the rules. A series of experiments on real data evaluates the two methods.
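
To make the prefix-tree idea concrete, here is a minimal sketch in Python. It is not either of the two methods proposed in the paper: it assumes a single, finite event sequence, counts only contiguous windows, and ignores time lags and any bounded-memory approximation. The names `Node`, `build_prefix_tree`, `generate_rules`, `width`, `min_support`, and `min_confidence` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One prefix-tree node; `count` is how many windows share this prefix."""
    count: int = 0
    children: dict = field(default_factory=dict)


def build_prefix_tree(events, width):
    """Insert every window of `width` consecutive events into a prefix tree.

    Assumption: `events` is a finite list of symbols from one stream.
    """
    root = Node()
    for start in range(len(events) - width + 1):
        node = root
        node.count += 1  # the root counts the total number of windows
        for symbol in events[start:start + width]:
            node = node.children.setdefault(symbol, Node())
            node.count += 1
    return root


def generate_rules(root, min_support, min_confidence):
    """Traverse the tree and emit (antecedent, consequent, support, confidence).

    A rule "prefix => symbol" has confidence count(prefix + symbol) / count(prefix).
    """
    rules = []

    def walk(node, prefix):
        for symbol, child in node.children.items():
            if child.count < min_support:
                continue  # longer extensions cannot be more frequent, so prune
            if prefix:
                confidence = child.count / node.count
                if confidence >= min_confidence:
                    rules.append((tuple(prefix), symbol, child.count, confidence))
            walk(child, prefix + [symbol])

    walk(root, [])
    return rules


# Hypothetical usage on a toy event sequence.
tree = build_prefix_tree(list("ABCABCABD"), width=3)
for antecedent, consequent, support, confidence in generate_rules(tree, 2, 0.6):
    print(antecedent, "=>", consequent, support, round(confidence, 2))
```

Storing counts on shared prefixes lets one traversal compute both the support of a serial pattern and the confidence of extending it by one event. A real streaming setting would additionally need to bound the size of the tree, which is where a trade-off between space and precision would arise.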