We present a method for the reliable detection of "unusual" sets of episodes, given as many pattern sequences that are scanned simultaneously for an occurrence as a subsequence in a large event stream within a window of size w. We also investigate the important special case of all permutations of the same sequence, which models the situation where the order of events in an episode does not matter, e.g., when events correspond to purchased market basket items. To build a reliable monitoring system, we compare the obtained measurements to a reference model, which in our case is a probabilistic model (Bernoulli or Markov). We first present a precise analysis that leads to the construction of a threshold. The difficulty of carrying out a probabilistic analysis for an arbitrary set of patterns stems from the possible simultaneous occurrence of many members of the set as subsequences in the same window, from the fact that different patterns typically share symbols, subsequences, or possibly prefixes, and from the fact that they may have different lengths. We also report extensive experimental results on the Wal-Mart transactions database that show remarkable agreement with our theoretical analysis. This paper extends our previous work [Reliable detection of episodes in event sequences], where we laid the foundation for the problem of the reliable detection of "unusual" episodes but did not consider more than one episode scanned simultaneously for an occurrence.
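As an informal illustration of the quantity being measured, the following minimal Python sketch counts the windows of size w in which at least one pattern from a set occurs as a subsequence, and builds the set of all permutations of a sequence for the order-insensitive case. The function names (`is_subsequence`, `count_windows_with_occurrence`, `all_permutations`) and the toy event stream are assumptions made for this example and are not taken from the paper. The sketch does not derive the threshold, which in the paper comes from the Bernoulli or Markov reference model.

```python
from itertools import permutations


def is_subsequence(pattern, window):
    """Return True if `pattern` occurs in `window` as a (not necessarily
    contiguous) subsequence, i.e. its symbols appear in order."""
    it = iter(window)
    # `symbol in it` advances the iterator, so order is enforced.
    return all(symbol in it for symbol in pattern)


def count_windows_with_occurrence(events, patterns, w):
    """Count sliding windows of size `w` over `events` that contain at
    least one member of `patterns` as a subsequence (naive scan)."""
    count = 0
    for start in range(len(events) - w + 1):
        window = events[start:start + w]
        if any(is_subsequence(p, window) for p in patterns):
            count += 1
    return count


def all_permutations(pattern):
    """Set of all orderings of `pattern`, for episodes whose event
    order does not matter (hypothetical helper for this sketch)."""
    return set(permutations(pattern))


# Toy event stream (illustrative only).
events = "abcab"
ordered = count_windows_with_occurrence(events, ["ab"], 3)
unordered = count_windows_with_occurrence(events, all_permutations("ab"), 3)
print(ordered, unordered)
```

In a monitoring setting, a count of this kind would be compared against a threshold obtained from the reference model. The naive scan above re-checks every pattern in every window and is meant only to make the definition concrete, not to be efficient.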