Classification of graphical data made easy

  • Authors:
  • Edmondo Trentin; Ernesto Di Iorio

  • Affiliations:
  • Dipartimento di Ingegneria dell'Informazione, Università di Siena, V. Roma, 56 - Siena, Italy (both authors)

  • Venue:
  • Neurocomputing
  • Year:
  • 2009

Abstract

The classification of graphical patterns (i.e., data represented in the form of labeled graphs) is a problem that has received considerable attention from the machine learning community in recent years. Solutions to the problem would be valuable in a number of applications, ranging from bioinformatics and cheminformatics to Web-related tasks and structural pattern recognition for image processing. Several approaches have been proposed so far, e.g. inductive logic programming and kernels for graphs. Connectionist models have been introduced too, namely recursive neural nets (RNNs) and graph neural nets (GNNs). Although their theoretical properties are sound and thoroughly understood, RNNs and GNNs suffer from drawbacks that may limit their application. This paper introduces an alternative connectionist framework for learning discriminant functions over graphical data. The approach is simple, is suited to maximum-a-posteriori classification of broad families of graphs, and overcomes some limitations of RNNs and GNNs. The idea is to describe a graph as an algebraic relation, i.e. as a subset of a Cartesian product. The class-posterior probabilities given the relation are then reduced (under an i.i.d. assumption) to products of probabilistic quantities, estimated using a multilayer perceptron. Empirical evidence shows that, in spite of its simplicity, the technique compares favorably with established approaches on several tasks involving different graphical representations of the data. In particular, on the classification of molecules from the Mutagenesis dataset (friendly + unfriendly), it obtains the best result to date (93.91%).
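The core idea of the abstract — treat a graph as a relation (a set of vertex-label pairs) and, under an i.i.d. assumption, combine per-element class posteriors estimated by a multilayer perceptron into a graph-level posterior — can be sketched as follows. This is a minimal illustration, not the authors' exact formulation: the edge encoding, the tiny fixed-weight MLP standing in for a trained network, and the function names are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-layer MLP with random weights, standing in for a
# trained posterior estimator. Input: a 4-dim edge encoding (two 2-dim
# vertex labels concatenated). Output: softmax over 2 classes.
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 2)), np.zeros(2)

def mlp_posterior(x):
    """Estimate P(class | x) for one element of the relation."""
    h = np.tanh(x @ W1 + b1)
    z = h @ W2 + b2
    e = np.exp(z - z.max())          # numerically stable softmax
    return e / e.sum()

def classify_graph(edges):
    """Class posterior for a graph given as a relation: a set of
    (label_u, label_v) pairs. Under the i.i.d. assumption the graph
    posterior factorizes over the elements of the relation; summing
    log-posteriors and renormalizing yields P(class | graph)."""
    logp = np.zeros(2)
    for lu, lv in edges:
        x = np.array([*lu, *lv], dtype=float)  # concatenate vertex labels
        logp += np.log(mlp_posterior(x) + 1e-12)
    p = np.exp(logp - logp.max())
    return p / p.sum()

# Toy graph: vertices carry 2-dim real labels; edges form the relation.
g = [((0.1, 0.9), (0.4, 0.2)), ((0.4, 0.2), (0.7, 0.7))]
posterior = classify_graph(g)
predicted_class = int(np.argmax(posterior))
```

A maximum-a-posteriori decision then simply picks the class with the largest combined posterior, as in the `predicted_class` line above.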