Mining Mutually Dependent Patterns

  • Authors:
  • Sheng Ma; Joseph L. Hellerstein

  • Venue:
  • ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining
  • Year:
  • 2001

Abstract

In some domains, such as isolating problems in computer networks and discovering stock market irregularities, there is more interest in patterns consisting of infrequent but highly correlated items than in patterns that occur frequently (as defined by minsup, the minimum support level). Herein, we describe the m-pattern, a new pattern that is defined in terms of minp, the minimum probability of mutual dependence of items in the pattern. We show that all infrequent m-patterns can be discovered by an efficient algorithm that makes use of: (a) a linear algorithm to qualify an m-pattern; (b) an effective technique for candidate pruning based on a necessary condition for the presence of an m-pattern; and (c) a level-wise search for m-pattern discovery (which is possible because m-patterns are downward closed). Further, we consider frequent m-patterns, which are defined in terms of both minp and minsup. Using synthetic data, we study the scalability of our algorithm. Then, we apply our algorithm to data from a production computer network, both to show the m-patterns present and to contrast them with frequent patterns. We show that when minp = 0, our algorithm is equivalent to finding frequent patterns. However, with a larger minp, our algorithm yields a modest number of highly correlated items, which makes it possible to mine for infrequent but highly correlated itemsets. To date, many actionable m-patterns have been discovered in production systems.
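The abstract does not spell out the definition, so the following is only a minimal sketch under a stated assumption: an itemset E counts as an m-pattern when P(E1 | E2) >= minp for every pair of non-empty subsets E1, E2 of E. Under that assumption it is enough to check support(E) / support({a}) >= minp for each single item a in E, which is one way to read the "linear algorithm to qualify an m-pattern". The function names (`support`, `is_m_pattern`, `mine_m_patterns`) are hypothetical, and the sketch prunes candidates by downward closure only; it does not reproduce the paper's pruning condition in (b) or its handling of minsup.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def is_m_pattern(itemset, transactions, minp):
    """Qualify an itemset with one check per item (assumed definition).

    Itemsets that never occur are rejected, since the conditional
    probabilities are then undefined.
    """
    joint = support(itemset, transactions)
    if joint == 0:
        return False
    return all(
        joint / support(frozenset([a]), transactions) >= minp
        for a in itemset
    )


def mine_m_patterns(transactions, minp):
    """Level-wise search for m-patterns of size >= 2.

    A candidate of size k+1 is generated only if all of its size-k
    subsets survived the previous level (downward closure).
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted({a for t in transactions for a in t})
    level = [frozenset([a]) for a in items]  # singletons qualify trivially
    found = []
    k = 1
    while level:
        if k > 1:
            found.extend(level)
        prev = set(level)
        candidates = set()
        for a, b in combinations(level, 2):
            union = a | b
            if len(union) == k + 1 and all(
                frozenset(s) in prev for s in combinations(union, k)
            ):
                candidates.add(union)
        level = [c for c in candidates if is_m_pattern(c, transactions, minp)]
        k += 1
    return found


if __name__ == "__main__":
    # Toy data: "a" and "b" are rare but always occur together,
    # while "c" is frequent and only loosely tied to the others.
    tx = [{"a", "b"}, {"a", "b"}, {"c"}, {"c", "d"}, {"c"}, {"c"}, {"a", "b", "c"}]
    print(mine_m_patterns(tx, minp=0.5))
```

In the toy data, the pair of "a" and "b" appears in only three of seven transactions, yet each item always occurs with the other, which is the kind of infrequent but highly correlated pattern the abstract describes.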