Discriminative frequent subgraph mining with optimality guarantees

  • Authors:
  • Marisa Thoma;Hong Cheng;Arthur Gretton;Jiawei Han;Hans-Peter Kriegel;Alex Smola;Le Song;Philip S. Yu;Xifeng Yan;Karsten M. Borgwardt

  • Affiliations:
  • Institute for Informatics, Ludwig-Maximilians-Universität München, Munich, Germany;Department of Systems Engineering and Engineering Management, Chinese University of Hong Kong, Hong Kong, China;School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA;University of Illinois at Urbana-Champaign, Urbana, IL, USA;Institute for Informatics, Ludwig-Maximilians-Universität München, Munich, Germany;Yahoo! Research, Santa Clara, CA, USA;School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA;University of Illinois at Chicago, Chicago, IL, USA;Department of Computer Science, University of California, Santa Barbara, CA, USA;Max Planck Institute for Developmental Biology and Max Planck Institute for Biological Cybernetics, Tübingen, Germany

  • Venue:
  • Statistical Analysis and Data Mining
  • Year:
  • 2010

Abstract

The goal of frequent subgraph mining is to detect subgraphs that occur frequently in a dataset of graphs. In classification settings, one is often interested in discovering discriminative frequent subgraphs, whose presence or absence is indicative of the class membership of a graph. In this article, we propose an approach to feature selection on frequent subgraphs, called CORK, that combines two central advantages. First, it optimizes a submodular quality criterion, which means that greedy feature selection yields a near-optimal solution. Second, our submodular quality criterion can be integrated into gSpan, the state-of-the-art tool for frequent subgraph mining, where it helps to prune the search space for discriminative frequent subgraphs during the mining process itself. Copyright © 2010 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 3: 302-318, 2010
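
The greedy selection step mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. The abstract does not define CORK's quality function, so the criterion used here (the negated count of cross-class graph pairs that the selected features cannot tell apart) is an assumption made for illustration. The function names and the toy data are likewise hypothetical. Each graph is represented as a binary vector indicating which candidate subgraphs it contains.

```python
from collections import Counter
from typing import List, Sequence


def correspondences(features: Sequence[Sequence[int]],
                    labels: Sequence[int],
                    selected: Sequence[int]) -> int:
    """Count pairs of graphs from different classes that have identical
    values on every selected feature (assumed criterion, see lead-in)."""
    pos: Counter = Counter()
    neg: Counter = Counter()
    for row, y in zip(features, labels):
        key = tuple(row[j] for j in selected)
        if y == 1:
            pos[key] += 1
        else:
            neg[key] += 1
    # Graphs sharing a key are indistinguishable; count cross-class pairs.
    return sum(pos[k] * neg[k] for k in pos)


def quality(features, labels, selected) -> int:
    """Higher is better: zero means every cross-class pair is separated."""
    return -correspondences(features, labels, selected)


def greedy_select(features, labels, k: int) -> List[int]:
    """Add, one at a time, the feature with the largest quality gain."""
    n_features = len(features[0])
    selected: List[int] = []
    current = quality(features, labels, selected)
    for _ in range(k):
        best_j, best_q = None, current
        for j in range(n_features):
            if j in selected:
                continue
            q = quality(features, labels, selected + [j])
            if q > best_q:  # strict improvement; ties keep the lower index
                best_j, best_q = j, q
        if best_j is None:  # no remaining feature improves the criterion
            break
        selected.append(best_j)
        current = best_q
    return selected


# Toy data: rows are graphs, columns are candidate subgraph features.
features = [
    [1, 0, 1],  # class 1
    [1, 1, 0],  # class 1
    [1, 0, 0],  # class 0
    [0, 1, 0],  # class 0
]
labels = [1, 1, 0, 0]
chosen = greedy_select(features, labels, k=3)
```

For monotone submodular functions under a cardinality constraint, greedy selection of this kind is known to reach at least a (1 - 1/e) fraction of the optimal value, which is the usual sense in which such a solution is called near-optimal.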