Optimized Disjunctive Association Rules via Sampling

  • Authors:
  • J. Elble;C. Heeren;L. Pitt

  • Affiliations:
  • -;-;-

  • Venue:
  • ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
  • Year:
  • 2003

Quantified Score

Hi-index 0.00

Visualization

Abstract

The problem of finding optimized support associationrules for a single numerical attribute, where the optimizedregion is a union of k disjoint intervals from the range ofthe attribute, is investigated. The first polynomial timealgorithm for the problem of finding such a region maximizingsupport and meeting a minimum cumulative confidencethreshold is given. Because the algorithm is notpractical, an ostensibly easier, more constrained versionof the problem is considered. Experiments demonstratethat the best extant algorithm for the constrained versionhas significant performance degradation on both a syntheticmodel of patterned data and on real world data sets.Running the algorithm on a small random sample is proposedas a means of obtaining near optimal results withhigh probability. Theoretical bounds on sufficient samplesize to achieve a given performance level are proved, andrapid convergence on synthetic and real-world data is validatedexperimentally.