A multiobjective evolutionary programming framework for graph-based data mining

  • Authors:
  • Prakash Shelokar;Arnaud Quirin;íScar CordóN

  • Affiliations:
  • European Centre for Soft Computing, 33600-Mieres, Spain;European Centre for Soft Computing, 33600-Mieres, Spain;European Centre for Soft Computing, 33600-Mieres, Spain and Department of Computer Science and Artificial Intelligence (DECSAI) and Research Centre on Information and Communication Technologies (C ...

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2013

Quantified Score

Hi-index 0.07

Visualization

Abstract

Subgraph mining is the process of identifying concepts describing interesting and repetitive subgraphs within graph-based data. The exponential number of possible subgraphs makes the problem very challenging. Existing methods apply a single-objective subgraph search with the view that interesting subgraphs are those capable of not merely compressing the data, but also enhancing the interpretation of the data considerably. Usually the methods operate by posing simple constraints (or user-defined thresholds) such as returning all subgraphs whose frequency is above a specified threshold. Such search approach may often return either a large number of solutions in the case of a weakly defined objective or very few in the case of a very strictly defined objective. In this paper, we propose a framework based on multiobjective evolutionary programming to mine subgraphs by jointly maximizing two objectives, support and size of the extracted subgraphs. The proposed methodology is able to discover a nondominated set of interesting subgraphs subject to tradeoff between the two objectives, which otherwise would not be achieved by the single-objective search. Besides, it can use different specific multiobjective evolutionary programming methods. Experimental results obtained by three of the latter methods on synthetically generated as well as real-life graph-based datasets validate the utility of the proposed methodology when benchmarked against classical single-objective methods and their previous, nonevolutionary multiobjective extensions.