Frequent pattern mining with uncertain data

  • Authors:
  • Charu C. Aggarwal; Yan Li; Jianyong Wang; Jing Wang

  • Affiliations:
  • IBM T. J. Watson Research Ctr, Hawthorne, NY, USA; Tsinghua University, Beijing, China; Tsinghua University, Beijing, China; New York University, New York, NY, USA

  • Venue:
  • Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2009

Abstract

This paper studies the problem of frequent pattern mining with uncertain data. We show how broad classes of algorithms can be extended to the uncertain data setting. In particular, we study candidate generate-and-test algorithms, hyper-structure algorithms, and pattern-growth-based algorithms. One key observation is that the experimental behavior of these classes of algorithms differs sharply between the uncertain case and the deterministic case. In particular, the hyper-structure and candidate generate-and-test algorithms perform much better than tree-based algorithms. This counter-intuitive behavior is important for the design of algorithms for the uncertain variant of the problem. We test the approaches on a number of real and synthetic data sets and show the effectiveness of two of our approaches relative to competing techniques.
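
To make the candidate generate-and-test idea concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the uncertainty model, so the sketch assumes a common one: each item in a transaction carries an independent existence probability, and an itemset's expected support is the sum over transactions of the product of its items' probabilities. The names expected_support, mine_expected_frequent, and min_esup are hypothetical and chosen for illustration only.

```python
from itertools import combinations
from math import prod


def expected_support(itemset, transactions):
    """Expected support under the ASSUMED item-independence model:
    sum over transactions of the product of the item probabilities.
    An item absent from a transaction contributes probability 0."""
    return sum(
        prod(t.get(item, 0.0) for item in itemset)
        for t in transactions
    )


def mine_expected_frequent(transactions, min_esup):
    """Level-wise candidate generate-and-test (Apriori-style) search.

    transactions: list of dicts mapping item -> existence probability.
    min_esup:     hypothetical threshold on expected support.
    Returns a dict mapping frozenset(itemset) -> expected support.
    """
    items = sorted({item for t in transactions for item in t})
    level = [frozenset([item]) for item in items]
    frequent = {}
    while level:
        # Test step: keep candidates whose expected support meets the threshold.
        survivors = []
        for candidate in level:
            esup = expected_support(candidate, transactions)
            if esup >= min_esup:
                frequent[candidate] = esup
                survivors.append(candidate)
        # Generate step: join survivors into candidates one item larger.
        k = len(level[0]) + 1
        survivor_set = set(survivors)
        joined = {a | b for a, b in combinations(survivors, 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent. This is valid
        # because multiplying in another probability <= 1 cannot raise the
        # expected support.
        level = sorted(
            (c for c in joined
             if all(frozenset(s) in survivor_set for s in combinations(c, k - 1))),
            key=sorted,
        )
    return frequent


# Small made-up data set, for illustration only.
transactions = [
    {"a": 0.9, "b": 0.8},
    {"a": 0.5, "c": 0.7},
    {"a": 1.0, "b": 0.4, "c": 0.6},
]
result = mine_expected_frequent(transactions, min_esup=1.0)
for itemset, esup in sorted(result.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), round(esup, 2))
```

With a threshold of 1.0 on this toy data, the single items and the pair {a, b} survive, while {a, c} and {b, c} fall below the threshold. In the deterministic case every probability is 1 and expected support reduces to the ordinary support count.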