Finding top-n colossal patterns based on clique search with dynamic update of graph

  • Authors:
  • Yoshiaki Okubo;Makoto Haraguchi

  • Affiliations:
  • Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Japan;Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Japan

  • Venue:
  • ICFCA'12 Proceedings of the 10th international conference on Formal Concept Analysis
  • Year:
  • 2012

Abstract

In this paper, we discuss a method for finding top-N colossal frequent patterns. The colossal patterns we aim to extract are maximal patterns whose lengths are among the N largest. Since colossal patterns can be found in relatively low regions of an itemset (concept) lattice, an efficient method with effective pruning mechanisms is needed. We design a depth-first branch-and-bound algorithm for finding colossal patterns with top-N length, in which the notion of a pattern graph plays an important role. A pattern graph is a compact representation of the class of frequent patterns of a designated length. A colossal pattern can be found as a clique in a pattern graph satisfying a certain condition. Based on this observation, we design an algorithm that finds our target patterns by examining cliques in a graph defined from the pattern graph. The algorithm is based on a depth-first branch-and-bound method for finding a maximum clique. Notably, as the search progresses, the graph under consideration is dynamically updated into a sparser one, which makes finding cliques much easier and the branch-and-bound pruning more powerful. To the best of our knowledge, this is the first algorithm tailored to this problem that can exactly identify the top-N colossal patterns. In our experiments, we compare the computational efficiency of our algorithm with that of well-known maximal frequent itemset miners on a synthetic dataset and a benchmark dataset.
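
As a rough illustration of the depth-first branch-and-bound maximum-clique search that the abstract names as its basis, the following Python sketch may help. It is a generic textbook-style search, not the authors' algorithm: the construction of the pattern graph, the clique condition, and the dynamic graph update are not specified in the abstract and are not modeled here. The names `build_graph` and `max_clique` and the example edge list are assumptions made only for this illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple

Graph = Dict[int, Set[int]]  # vertex -> set of adjacent vertices


def build_graph(edges: Iterable[Tuple[int, int]]) -> Graph:
    """Build an undirected adjacency map from an edge list (hypothetical helper)."""
    graph: Graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def max_clique(graph: Graph) -> List[int]:
    """Return one maximum clique using depth-first branch-and-bound."""
    best: List[int] = []

    def expand(clique: List[int], candidates: Set[int]) -> None:
        nonlocal best
        if len(clique) > len(best):
            best = clique[:]  # record the new incumbent
        # Try vertices with many neighbours among the candidates first.
        order = sorted(candidates,
                       key=lambda u: len(graph[u] & candidates),
                       reverse=True)
        for v in order:
            # Bound: even taking every remaining candidate cannot beat the incumbent.
            if len(clique) + len(candidates) <= len(best):
                return
            clique.append(v)
            # Only common neighbours of the current clique remain candidates.
            expand(clique, candidates & graph[v])
            clique.pop()
            # All cliques extending the current one with v have been explored.
            candidates = candidates - {v}

    expand([], set(graph))
    return sorted(best)


# Example: a 4-clique {1, 2, 3, 4} with two extra vertices attached.
example = build_graph([(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4),
                       (1, 5), (2, 5), (5, 6)])
print(max_clique(example))  # [1, 2, 3, 4]
```

The bound in this sketch is the simplest possible one (current clique size plus the number of remaining candidates). It becomes tighter as the candidate set shrinks, which is consistent with the abstract's remark that a sparser graph makes pruning more powerful.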