Reliable Detection of Episodes in Event Sequences

  • Authors:
  • Robert Gwadera; Mikhail Atallah; Wojciech Szpankowski

  • Venue:
  • ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
  • Year:
  • 2003

Abstract

Suppose one wants to detect "bad" or "suspicious" subsequences in event sequences. Whether an observed pattern of activity (in the form of a particular subsequence) is significant and should be a cause for alarm depends on how likely it is to occur fortuitously. A long enough sequence of observed events will almost certainly contain any subsequence, and setting thresholds for alarm is an important issue in a monitoring system that seeks to avoid false alarms. Suppose a long sequence T of observed events contains a suspicious subsequence pattern S within it, where the suspicious subsequence S consists of m events and spans a window of size w within T. We address the fundamental problem: is a certain number of occurrences of a particular subsequence unlikely to be fortuitous (i.e., indicative of suspicious activity)? If the probability of fortuitous occurrences is high and an automated monitoring system flags it as suspicious anyway, then such a system will suffer from generating too many false alarms. This paper quantifies the probability of such an S occurring in T within a window of size w, the number of distinct windows containing S as a subsequence, the expected number of such occurrences, and its variance, and establishes its limiting distribution, which allows one to set an alarm threshold so that the probability of false alarms is very small. We report on experiments confirming the theory and showing that we can detect bad subsequences with a low false alarm rate.
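
To make the counted quantity concrete, the following is a minimal sketch of how one might count the windows of size w in T that contain S as a subsequence. It is an illustration only, not the paper's method: the function names `contains_subsequence` and `count_windows_with_episode` are hypothetical, the events are assumed to be comparable with `==`, and the brute-force scan takes roughly O(|T| · w) time.

```python
def contains_subsequence(window, pattern):
    """Return True if `pattern` occurs in `window` as a subsequence,
    i.e. in order but not necessarily contiguously."""
    it = iter(window)
    # Each `any` call resumes the shared iterator where the previous
    # match left off, so the order of the pattern events is enforced.
    return all(any(event == p for event in it) for p in pattern)


def count_windows_with_episode(events, pattern, w):
    """Count the sliding windows of length `w` in `events` that contain
    `pattern` as a subsequence (hypothetical helper for illustration)."""
    if w <= 0 or len(events) < w:
        return 0
    return sum(
        contains_subsequence(events[i:i + w], pattern)
        for i in range(len(events) - w + 1)
    )
```

For example, with T = "abcabc", S = "ac", and w = 3, the windows are "abc", "bca", "cab", and "abc", so `count_windows_with_episode("abcabc", "ac", 3)` returns 2. An alarm threshold would then be compared against such a count; how that threshold is derived is the subject of the paper and is not reproduced here.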