CP-summary: a concise representation for browsing frequent itemsets

  • Authors: Ardian Kristanto Poernomo; Vivekanand Gopalkrishnan

  • Affiliations: Nanyang Technological University, Singapore, Singapore; Nanyang Technological University, Singapore, Singapore

  • Venue: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
  • Year: 2009

Abstract

This paper tackles the problem of summarizing frequent itemsets. We observe that previous notions of summaries cannot be used directly for analyzing frequent itemsets. To be usable for analysis, one requirement is that analysts be able to browse all frequent itemsets using only the summary. For this purpose, we propose building the summary on a novel formulation, the conditional profile (or c-profile). The proposed summary has several features: (1) each profile in the summary can be analyzed independently, (2) it provides an error guarantee (ε-adequate), and (3) it produces no false positives or false negatives. Given this formulation, the next challenge is to produce the most concise summary that satisfies this requirement. We also design an algorithm for this task that is both effective and efficient. The quality of our approach is demonstrated by extensive experiments. Implementations of the algorithms are available from www.cais.ntu.edu.sg/~vivek/pubs/cprofile09.