A sampling-based method for mining frequent patterns from databases

  • Authors:
  • Yen-Liang Chen;Chin-Yuan Ho

  • Affiliations:
  • Dept. of Information Management, National Central Univ, Chung-Li, Taiwan;Dept. of Information Management, National Central Univ, Chung-Li, Taiwan

  • Venue:
  • FSKD'05 Proceedings of the Second International Conference on Fuzzy Systems and Knowledge Discovery - Volume Part II
  • Year:
  • 2005

Abstract

Mining frequent itemsets (frequent patterns) in transaction databases is a well-known problem in data mining research. This work proposes a sampling-based method for finding frequent patterns. The method has three phases. In the first phase, we draw a small sample of the data to estimate the set of frequent patterns, denoted FS. The second phase computes the actual supports of the patterns in FS and identifies a subset of FS that needs further examination in the next phase. Finally, the third phase explores this subset and finds all frequent patterns that are missing from FS. Empirical results show that the algorithm is efficient, running about two to three times faster than the well-known FP-growth algorithm.
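
The three-phase structure can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not say how the sample is mined or how the third phase selects and explores patterns, so this sketch substitutes a simple level-wise negative-border expansion for both steps. The names `mine_by_sampling`, `expand`, `negative_border`, and `count_supports`, and the parameters `sample_size`, `slack`, and `seed`, are all hypothetical and chosen for illustration only.

```python
import random


def count_supports(transactions, candidates):
    """Count how many transactions contain each candidate itemset."""
    counts = {c: 0 for c in candidates}
    for t in transactions:
        for c in counts:
            if c <= t:
                counts[c] += 1
    return counts


def negative_border(frequent, items):
    """Itemsets outside `frequent` whose immediate subsets are all in it."""
    border = {frozenset([i]) for i in items} - frequent
    for a in frequent:
        for i in items:
            if i not in a:
                c = a | {i}
                if c not in frequent and all(c - {j} in frequent for j in c):
                    border.add(c)
    return border


def expand(transactions, min_count, counted):
    """Count border itemsets until no uncounted candidate remains.

    `counted` maps already-counted itemsets to their supports in
    `transactions`. Returns {itemset: count} for the frequent ones.
    """
    items = {i for t in transactions for i in t}
    counted = dict(counted)
    while True:
        frequent = {c for c, k in counted.items() if k >= min_count}
        todo = negative_border(frequent, items) - set(counted)
        if not todo:
            return {c: counted[c] for c in frequent}
        counted.update(count_supports(transactions, todo))


def mine_by_sampling(transactions, min_support, sample_size, slack=0.8, seed=0):
    """Three-phase sketch; `slack` lowers the threshold on the sample (assumption)."""
    db = [frozenset(t) for t in transactions]
    rng = random.Random(seed)
    sample = rng.sample(db, min(sample_size, len(db)))

    # Phase 1: estimate FS by mining only the sample.
    fs = expand(sample, slack * min_support * len(sample), {})

    # Phase 2: compute the actual supports of FS on the full database.
    actual = count_supports(db, fs)

    # Phase 3 (assumed strategy): grow from the verified patterns so that
    # frequent patterns absent from FS are still found.
    return expand(db, min_support * len(db), actual)
```

Calling `mine_by_sampling(db, min_support=0.5, sample_size=3)` on a list of item sets returns a dictionary mapping each frequent itemset (as a `frozenset`) to its support count in the full database. Because the final expansion stops only when every border itemset has been counted, the result is complete regardless of how representative the sample was; a better sample only reduces the work left for the third phase.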