A representation to apply usual data mining techniques to chemical reactions

  • Authors:
  • Frank Hoonakker; Nicolas Lachiche; Alexandre Varnek; Alain Wagner

  • Affiliations:
  • Chemoinformatics laboratory, University of Strasbourg, France and eNovalys, Illkirch, France; LSIIT, University of Strasbourg, France; eNovalys, Illkirch, France; eNovalys, Illkirch, France and Functional ChemoSystems, University of Strasbourg, France

  • Venue:
  • IEA/AIE'10 Proceedings of the 23rd international conference on Industrial engineering and other applications of applied intelligent systems - Volume Part II
  • Year:
  • 2010

Abstract

A chemical reaction involves several molecules of two types: reactants and products. Existing data mining techniques, e.g. Quantitative Structure-Activity Relationship (QSAR) methods, deal with individual molecules only. In this article, we propose to use the Condensed Graph of Reaction (CGR) approach, which merges all molecules involved in a reaction into one molecular graph. This allows one to consider reactions as pseudomolecules and to develop QSAR models based on fragment descriptors. Here, ISIDA fragment descriptors calculated from CGRs have been used to build quantitative models for the rate constant of SN2 reactions in water. Three common attribute-value regression algorithms (linear regression, support vector machine, and regression trees) have been evaluated.
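
The idea of merging reactants and products into one graph can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the authors' implementation and not the ISIDA descriptor scheme: atoms are assumed to carry atom-map numbers shared by both sides of the reaction, hydrogens are omitted, and the names `condensed_graph` and `fragment_counts`, as well as the `C[1>0]Cl` label format, are hypothetical. Each bond in the merged graph is labelled with its order before and after the reaction, and two-atom fragments are then counted to give one attribute-value row per reaction.

```python
from collections import Counter


def condensed_graph(reactant_bonds, product_bonds):
    """Merge two atom-mapped bond tables into a single graph.

    Each table maps a pair (i, j) of atom-map numbers, with i < j, to a bond
    order. The result maps every bond seen on either side to a tuple
    (order_in_reactants, order_in_products), using 0 for "no bond".
    """
    cgr = {}
    for bond in set(reactant_bonds) | set(product_bonds):
        cgr[bond] = (reactant_bonds.get(bond, 0), product_bonds.get(bond, 0))
    return cgr


def fragment_counts(atoms, cgr):
    """Count two-atom fragments, labelled with the bond-order change."""
    counts = Counter()
    for (i, j), (before, after) in cgr.items():
        # Sort the element symbols so the label does not depend on atom order.
        a, b = sorted((atoms[i], atoms[j]))
        counts[f"{a}[{before}>{after}]{b}"] += 1
    return counts


# Illustrative SN2 reaction (hydrogens omitted): CH3Cl + OH- -> CH3OH + Cl-
atoms = {1: "C", 2: "Cl", 3: "O"}
reactant_bonds = {(1, 2): 1}   # C-Cl single bond present before the reaction
product_bonds = {(1, 3): 1}    # C-O single bond present after the reaction

cgr = condensed_graph(reactant_bonds, product_bonds)
descriptors = fragment_counts(atoms, cgr)
print(dict(descriptors))
```

In this sketch the broken C-Cl bond and the formed C-O bond each become one descriptor. Stacking such count vectors for many reactions would give the attribute-value table that a standard regression algorithm expects, with the rate constant as the target.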