InfoMiner+: Mining Partial Periodic Patterns with Gap Penalties

  • Authors:
  • Jiong Yang; Wei Wang; Philip S. Yu

  • Venue:
  • ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
  • Year:
  • 2002

Abstract

In this paper, we focus on mining periodic patterns while allowing some degree of imperfection, in the form of random replacement, relative to a perfect periodic pattern. Information gain was proposed to identify patterns with events of vastly different occurrence frequencies and to adjust for deviation from a pattern. However, it imposes no penalty when there are gaps between pattern occurrences. In many applications, e.g., bioinformatics, it is important to identify subsequences in which a pattern repeats perfectly (or near perfectly). As a solution, we extend the information gain measure to include a penalty for gaps between pattern occurrences. We call this measure generalized information gain. Furthermore, we want to find a subsequence S' such that, for a pattern P, the generalized information gain of P in S' is high. This is particularly useful in locating repeats in DNA sequences. In this paper, we develop an effective mining algorithm, InfoMiner+, to simultaneously mine significant patterns and the associated subsequences.
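
The abstract does not give the formula for generalized information gain, so the following Python sketch illustrates only the general idea and is not the InfoMiner+ algorithm. It assumes that an event's information is the negative logarithm of its empirical frequency (base: number of distinct events), that a matching period-length segment earns the pattern's information, and that a non-matching segment (a gap) costs the same amount. All function names, the wildcard symbol `*`, and the fixed segment alignment are hypothetical choices made for illustration. The highest-scoring run of segments, standing in for the subsequence S', is found with a standard maximum-subarray scan.

```python
import math
from collections import Counter

WILDCARD = "*"  # hypothetical "don't care" symbol, not taken from the paper


def event_information(sequence):
    """Assumed measure: I(a) = -log_|E| Prob(a), with Prob estimated from counts."""
    counts = Counter(sequence)
    total = len(sequence)
    base = max(len(counts), 2)  # avoid a log base of 1 for single-event input
    return {event: -math.log(count / total, base) for event, count in counts.items()}


def pattern_information(pattern, info):
    """Sum the information of the non-wildcard events in the pattern."""
    return sum(info[event] for event in pattern if event != WILDCARD)


def matches(pattern, segment):
    """A segment matches if every non-wildcard position agrees."""
    return all(p == WILDCARD or p == s for p, s in zip(pattern, segment))


def best_subsequence(sequence, pattern):
    """Return (score, first_segment, end_segment) for the best run of segments.

    Assumption: each matching segment adds the pattern's information and each
    gap segment subtracts it. end_segment is exclusive.
    """
    period = len(pattern)
    gain = pattern_information(pattern, event_information(sequence))
    segments = [sequence[i:i + period]
                for i in range(0, len(sequence) - period + 1, period)]
    best = (0.0, 0, 0)
    running, start = 0.0, 0
    for i, segment in enumerate(segments):
        running += gain if matches(pattern, segment) else -gain
        if running <= 0:
            # A run that has lost all its gain is never worth extending.
            running, start = 0.0, i + 1
        elif running > best[0]:
            best = (running, start, i + 1)
    return best


if __name__ == "__main__":
    sequence = "abcabcabcxyzxyzxyzxyzabc"
    score, first, end = best_subsequence(sequence, "ab*")
    print(score, first, end)
```

With this input, the three leading `abc` segments form the best run (segments 0 through 2). The isolated `abc` at the end is not merged into it, because the four intervening gap segments cost more than the final match recovers. Without the gap penalty, all four matches would count equally regardless of where they occur, which is the weakness the abstract attributes to plain information gain.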