On enumerating frequent closed patterns with key in multi-relational data

  • Authors: Hirohisa Seki, Yuya Honda, Shinya Nagano

  • Affiliations: Nagoya Inst. of Technology, Nagoya, Japan (all authors)

  • Venue: DS'10 Proceedings of the 13th international conference on Discovery science
  • Year: 2010

Abstract

We study the problem of mining closed patterns in multi-relational databases. Garriga et al. (IJCAI'07) proposed RelLCM2, an algorithm for mining closed patterns (i.e., conjunctions of literals) in multi-relational data; it extends LCM, the efficient enumeration algorithm for frequent closed itemset mining proposed in the seminal paper by Uno et al. (DS'04). We assume that the database under consideration contains a special predicate called the key (or target), which determines the entities of interest and what is to be counted. We introduce the notion of closed patterns with key (key-closedness for short), in which the variables of a pattern other than the one in the key predicate are existentially quantified and linked to a given target object. We then define a closure operation (key-closure) for computing key-closed patterns, and show that the difference between the semantics of key-closed patterns and that of the closed patterns in RelLCM2 leads to different properties of the closure operations; in particular, uniqueness of the closure does not hold for key-closure. Nevertheless, we show that key-closed patterns can be enumerated using the technique of ppc-extensions à la LCM, so that the enumeration requires no storage for previously generated patterns. We also propose a literal order designed for mining key-closed patterns, which reduces the search space. We show the correctness of our algorithm and discuss its computational complexity. Some preliminary experimental results are also given.
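
For readers unfamiliar with closure operations, the following is a minimal sketch of the closure used in the propositional (itemset) setting that LCM addresses: the closure of an itemset is the intersection of all transactions that contain it. This is not the key-closure defined in the paper, for which uniqueness of the closure does not hold. The function names, the toy transactions, and the convention for patterns with no occurrence are all assumptions made for illustration.

```python
from functools import reduce


def occurrences(pattern, transactions):
    """Return the transactions that contain every item of `pattern`."""
    return [t for t in transactions if pattern <= t]


def closure(pattern, transactions):
    """Itemset closure: the intersection of all transactions containing `pattern`.

    Assumption of this sketch: a pattern with no occurrence is returned unchanged.
    """
    occ = occurrences(pattern, transactions)
    if not occ:
        return frozenset(pattern)
    return frozenset(reduce(lambda a, b: a & b, occ))


def is_closed(pattern, transactions):
    """A pattern is closed when it equals its own closure."""
    return closure(pattern, transactions) == frozenset(pattern)


# Hypothetical toy database: each transaction is a set of items.
transactions = [
    frozenset("abc"),
    frozenset("abd"),
    frozenset("ab"),
    frozenset("cd"),
]

# Every transaction containing 'a' also contains 'b', so {'a'} is not closed.
closure_of_a = closure(frozenset("a"), transactions)
```

In this propositional setting each itemset has exactly one closure, and applying the operation twice gives the same result as applying it once. The abstract states that the first of these properties is lost for key-closure, which is why the enumeration there needs separate treatment.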