In this paper, we address the problem of mining sequential patterns in multiple data streams. Building on PrefixSpan, a state-of-the-art sequential pattern mining algorithm for transaction databases, we propose MILE, an efficient algorithm for this task. MILE recursively uses the knowledge of existing patterns to avoid redundant data scanning, and can therefore speed up the discovery of new patterns. Another distinctive feature of MILE is that it can incorporate prior knowledge of the data distribution in the streams into the mining process to further improve performance. Extensive empirical results show that MILE is significantly faster than PrefixSpan. Because MILE consumes more memory than PrefixSpan, we also present a solution for balancing memory usage and time efficiency in memory-constrained environments.
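For orientation, the pattern-growth idea behind the PrefixSpan baseline can be sketched as follows. This is a minimal illustration of prefix-projected mining, not MILE and not the authors' implementation. It assumes each sequence is a plain list of single items (no itemsets, no gap constraints). The names `prefixspan`, `grow`, and `min_support` are chosen here for illustration only.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern_tuple: support} for every frequent sequential pattern.

    Simplifying assumption: each sequence is a list of single items.
    """
    results = {}

    def grow(prefix, projected):
        # `projected` holds (sequence_index, start_offset) pairs: the
        # suffix of each sequence that follows the current prefix.
        counts = defaultdict(int)
        for seq_idx, start in projected:
            # Count each item at most once per sequence.
            for item in set(sequences[seq_idx][start:]):
                counts[item] += 1

        for item, support in sorted(counts.items()):
            if support < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = support

            # Project on the first occurrence of `item` in each suffix,
            # so only the remaining suffixes are scanned at the next level.
            new_projected = []
            for seq_idx, start in projected:
                try:
                    pos = sequences[seq_idx].index(item, start)
                except ValueError:
                    continue  # item does not occur in this suffix
                new_projected.append((seq_idx, pos + 1))
            grow(pattern, new_projected)

    grow((), [(i, 0) for i in range(len(sequences))])
    return results


if __name__ == "__main__":
    streams = [["a", "b", "c"], ["a", "c", "b"], ["a", "b"]]
    for pattern, support in prefixspan(streams, min_support=2).items():
        print(pattern, support)
```

Each recursive call scans only the projected suffixes rather than the full data, which is the property the abstract refers to when it describes reusing known patterns to avoid redundant scanning.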