Discovery of interesting episodes in sequence data

  • Authors:
  • Ambika Srinivasan;Dhawal Bhatia;Sharma Chakravarthy

  • Affiliations:
  • The University of Texas at Arlington;The University of Texas at Arlington;The University of Texas at Arlington

  • Venue:
  • Proceedings of the 2006 ACM symposium on Applied computing
  • Year:
  • 2006

Abstract

There is a considerable body of work on sequence mining of transactional data. Most related work on point data (as opposed to significant intervals) makes several passes over the entire dataset to discover frequently occurring sequential patterns. Hybrid apriori, proposed in this paper, is, as the name implies, an apriori-class mining algorithm implemented in SQL, but it takes a different approach. Significant intervals for each event (or device) are computed first and then used to detect frequent event patterns. This approach has two advantages. First, compressing the data set into significant intervals reduces the size of the input. Second, each event/device is processed individually, which allows the significant intervals of different events to be computed in parallel. The hybrid apriori algorithm then works on the significant intervals using an apriori-style algorithm adapted to intervals. Our approach has significant advantages over traditional mining algorithms in efficiency, scalability, and storage requirements.
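
The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's SQL implementation, and the abstract does not define a significant interval, so everything here is an assumption made for illustration: the names `significant_intervals` and `frequent_pairs`, the parameters `max_gap`, `min_count`, and `window`, and the rule that an interval is a run of occurrences no more than `max_gap` apart containing at least `min_count` occurrences.

```python
from collections import defaultdict
from itertools import product


def significant_intervals(timestamps, max_gap, min_count):
    """Compress one event's timestamps into (start, end, count) intervals.

    Assumed definition (not from the paper): a run of occurrences whose
    consecutive gaps are at most max_gap, with at least min_count occurrences.
    """
    intervals = []
    run = []
    for t in sorted(timestamps):
        if run and t - run[-1] > max_gap:
            # The gap is too large: close the current run.
            if len(run) >= min_count:
                intervals.append((run[0], run[-1], len(run)))
            run = []
        run.append(t)
    if len(run) >= min_count:
        intervals.append((run[0], run[-1], len(run)))
    return intervals


def frequent_pairs(intervals_by_event, window):
    """Apriori-style level-2 step over intervals (hypothetical join rule).

    Pairs event a with event b when an interval of b starts within
    `window` time units after an interval of a ends.
    """
    pairs = []
    for a, b in product(intervals_by_event, repeat=2):
        if a == b:
            continue
        for s1, e1, c1 in intervals_by_event[a]:
            for s2, e2, c2 in intervals_by_event[b]:
                if 0 <= s2 - e1 <= window:
                    pairs.append(((a, b), (s1, e2), min(c1, c2)))
    return pairs


# Made-up point data: (event, timestamp).
log = [("lamp", 1), ("lamp", 2), ("lamp", 3),
       ("tv", 5), ("tv", 6), ("tv", 7), ("fan", 40)]

by_event = defaultdict(list)
for event, t in log:
    by_event[event].append(t)

# Each event is processed independently, so this loop could run in parallel.
intervals_by_event = {}
for event, ts in by_event.items():
    found = significant_intervals(ts, max_gap=1, min_count=3)
    if found:
        intervals_by_event[event] = found

pairs = frequent_pairs(intervals_by_event, window=5)
print(intervals_by_event)
print(pairs)
```

In this toy input, seven point records shrink to two intervals before any pattern search begins, which is the kind of input reduction the abstract describes. Only events that survive the first stage take part in candidate generation.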