An analysis of the redundancy of graph invariants used in chemoinformatics

  • Authors: Boris Hollas
  • Affiliations: Theoretische Informatik, Universität Ulm, Ulm, Germany
  • Venue: Discrete Applied Mathematics
  • Year: 2006

Abstract

Molecular descriptors play a decisive role in evaluating large virtual libraries and in predicting biological or physicochemical properties of compounds. Topological indices are an important class of molecular descriptors, based on the graph of a molecule. A major problem is that many topological indices are considerably correlated, which impedes data analysis and interpretation. In addition, a size-dependent variance of topological indices adversely affects data processing by neural networks. Using random graphs as a model for molecules, we examine the correlations and variance of an abstract topological index with independent vertex properties. We consider two random graph models: one that makes no assumptions about the distribution of graphs, and one on a fixed number of vertices in which edges are selected independently. We show that topological indices may be strongly correlated even for independent vertex properties. On the other hand, uncorrelated topological indices and indices with constant or Θ(1) variance can easily be obtained within the respective random graph models.
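
The correlation effect described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper. It assumes a hypothetical index of the form "sum of x[u] * x[v] over all edges {u, v}", a fixed number of vertices with independently selected edges, and uniformly distributed vertex properties; the names `random_graph`, `edge_index`, and `simulate` are likewise illustrative. Under these assumptions, two indices built from mutually independent vertex properties on the same random graph tend to be correlated when the properties have a nonzero mean, because both indices grow with the number of edges, and the correlation should largely vanish once the properties are centered.

```python
import random
from itertools import combinations


def random_graph(n, p, rng):
    """Edge list of a graph on n vertices; each edge is kept independently with probability p."""
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


def edge_index(edges, x):
    """Hypothetical index (an assumption, not the paper's definition):
    sum of x[u] * x[v] over all edges {u, v}."""
    return sum(x[u] * x[v] for u, v in edges)


def pearson(a, b):
    """Sample Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((s - mean_a) * (t - mean_b) for s, t in zip(a, b))
    var_a = sum((s - mean_a) ** 2 for s in a)
    var_b = sum((t - mean_b) ** 2 for t in b)
    return cov / (var_a * var_b) ** 0.5


def simulate(shift, trials=2000, n=30, p=0.2, seed=0):
    """Correlation between two indices whose vertex properties are independent.

    Vertex properties are drawn uniformly from [shift - 0.5, shift + 0.5],
    so shift is their mean; shift = 0 gives centered properties.
    """
    rng = random.Random(seed)
    index_x, index_y = [], []
    for _ in range(trials):
        edges = random_graph(n, p, rng)
        x = [rng.uniform(-0.5, 0.5) + shift for _ in range(n)]
        y = [rng.uniform(-0.5, 0.5) + shift for _ in range(n)]
        index_x.append(edge_index(edges, x))
        index_y.append(edge_index(edges, y))
    return pearson(index_x, index_y)


corr_shifted = simulate(shift=1.5)   # properties with mean 1.5
corr_centered = simulate(shift=0.0)  # properties with mean 0
print(f"nonzero-mean properties: r = {corr_shifted:.2f}")
print(f"centered properties:     r = {corr_centered:.2f}")
```

In this toy setting the shared randomness of the graph is the only link between the two indices, which is why centering the vertex properties is enough to decorrelate them. The parameter values (`n=30`, `p=0.2`, `trials=2000`) are arbitrary choices for the illustration.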