Multi-way set enumeration in real-valued tensors

  • Authors:
  • Elisabeth Georgii;Koji Tsuda;Bernhard Schölkopf

  • Affiliations:
  • MPI for Biological Cybernetics/Friedrich Miescher Laboratory of the Max Planck Society, Tübingen, Germany;National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan;MPI for Biological Cybernetics, Tübingen, Germany

  • Venue:
  • Proceedings of the 2nd Workshop on Data Mining using Matrices and Tensors
  • Year:
  • 2009

Abstract

The analysis of n-ary relations receives attention in many different fields, for instance biology, web mining, and social studies. In the basic setting, there are n sets of instances, and each observation associates n instances, one from each set. A common approach to exploring such n-way data is the search for n-set patterns. An n-set pattern consists of specific subsets of the n instance sets such that all possible n-ary associations between the corresponding instances are observed. This provides a higher-level view of the data, revealing associative relationships between groups of instances. Here, we generalize this approach in two respects. First, we tolerate missing observations to a certain degree; that is, we are also interested in n-sets where most (although not all) of the possible combinations have been recorded in the data. Second, we take association weights into account. More precisely, we propose a method to enumerate all n-sets that satisfy a minimum threshold with respect to the average association weight. Non-observed associations receive a default weight of zero. Technically, we solve the enumeration task using a reverse search strategy, which allows for effective pruning of the search space. In addition, our algorithm provides a ranking of the solutions and can consider further constraints. We show experimental results on artificial and real-world data sets from different domains.
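
To make the selection criterion concrete, the following minimal sketch computes the average association weight of an n-set in a sparse weighted tensor and lists every n-set that reaches a threshold. It is a brute-force illustration only, not the reverse search strategy described in the abstract, and it is practical only for tiny inputs. The names `average_weight`, `nonempty_subsets`, `enumerate_nsets`, and the dictionary-based tensor layout are assumptions made for this example.

```python
from itertools import combinations, product


def average_weight(weights, subsets):
    """Mean weight over all index combinations spanned by the n-set.

    `weights` maps index tuples to association weights (hypothetical layout);
    combinations that were never observed count as weight zero.
    """
    cells = list(product(*subsets))
    return sum(weights.get(cell, 0.0) for cell in cells) / len(cells)


def nonempty_subsets(size):
    """Yield every non-empty subset of range(size) as a sorted tuple."""
    for k in range(1, size + 1):
        yield from combinations(range(size), k)


def enumerate_nsets(weights, dims, threshold):
    """Exhaustively list all n-sets whose average weight is >= threshold.

    Brute force: the number of candidates grows exponentially with the
    dimension sizes. Results are ranked by decreasing average weight.
    """
    hits = []
    for subsets in product(*(nonempty_subsets(d) for d in dims)):
        score = average_weight(weights, subsets)
        if score >= threshold:
            hits.append((score, subsets))
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return hits


# Toy 2 x 2 x 2 tensor with four observed associations (made-up values).
weights = {(0, 0, 0): 1.0, (0, 0, 1): 1.0, (0, 1, 0): 1.0, (1, 1, 1): 0.5}
hits = enumerate_nsets(weights, dims=(2, 2, 2), threshold=0.75)
for score, subsets in hits:
    print(score, subsets)
```

In this toy tensor, the n-set ({0}, {0, 1}, {0, 1}) covers four possible combinations, of which three are observed with weight 1.0 and one is missing, so its average weight is 0.75 and it passes the threshold despite the missing observation.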