Similarity between Event Types in Sequences

  • Authors:
  • Heikki Mannila; Pirjo Moen

  • Venue:
  • DaWaK '99 Proceedings of the First International Conference on Data Warehousing and Knowledge Discovery
  • Year:
  • 1999

Abstract

Similarity or distance between objects is one of the central concepts in data mining. In this paper we consider the following problem: given a set of event sequences, define a useful notion of similarity between the different types of events occurring in the sequences. We approach the problem by considering two event types to be similar if they occur in similar contexts. The context of an occurrence of an event type is defined as the set of types of the events happening within a certain time limit before the occurrence. Then two event types are similar if their sets of contexts are similar. We quantify this by using a simple approach of computing centroids of sets of contexts and using the L1 distance. We present empirical results on telecommunications alarm sequences and student enrollment data, showing that the method produces intuitively appealing results.
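The approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a sequence is a list of (time, event_type) pairs, that the context window is the half-open interval [t - window, t), and that each context is encoded as a 0/1 vector over all event types so that the centroid is the component-wise mean. The function names and the `window` parameter are hypothetical and chosen for illustration only.

```python
def contexts(sequence, target, window):
    """Collect one context per occurrence of `target`.

    Assumption: a context is the set of event types seen in the
    half-open interval [t - window, t) before the occurrence at time t.
    """
    result = []
    for t, etype in sequence:
        if etype == target:
            ctx = {e for s, e in sequence if t - window <= s < t}
            result.append(ctx)
    return result


def centroid(ctxs, event_types):
    """Mean of the 0/1 indicator vectors of the contexts.

    Assumption: an event type with no occurrences gets the zero vector.
    """
    n = len(ctxs)
    if n == 0:
        return {e: 0.0 for e in event_types}
    return {e: sum(e in c for c in ctxs) / n for e in event_types}


def l1_distance(u, v):
    """L1 (Manhattan) distance between two vectors with the same keys."""
    return sum(abs(u[e] - v[e]) for e in u)


def event_type_distance(sequence, a, b, window):
    """L1 distance between the context centroids of event types a and b."""
    event_types = sorted({e for _, e in sequence})
    ca = centroid(contexts(sequence, a, window), event_types)
    cb = centroid(contexts(sequence, b, window), event_types)
    return l1_distance(ca, cb)
```

Under these assumptions, two event types that are always preceded by the same types of events get distance 0, and the distance grows as their typical contexts diverge. Other choices, such as including simultaneous events or weighting contexts, would change the numbers but not the overall idea.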