Faster reaction mapping through improved naming techniques

  • Authors:
  • Tina M. Kouri;Dinesh P. Mehta

  • Affiliations:
  • University of South Florida;Colorado School of Mines

  • Venue:
  • Journal of Experimental Algorithmics (JEA)
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

Automated reaction mapping is an important tool in cheminformatics where it may be used to classify reactions or validate reaction mechanisms. The reaction mapping problem is known to be NP-Hard and may be formulated as an optimization problem. In this article, we present four algorithms that continue to obtain optimal solutions to this problem, but with significantly improved runtimes over the previous Constructive Count Vector (CCV) algorithm. Our algorithmic improvements include (i) the use of a fast (but not 100% accurate) canonical labeling algorithm, (ii) name reuse (i.e., storing intermediate results rather than recomputing), and (iii) an incremental approach to canonical name computation. The time to map the reactions from the Kegg/Ligand database previously took over 2 days using CCV, but now it takes fewer than 4 hours to complete. Experimental results on chemical reaction databases demonstrate our 2-CCV FDN MS algorithm usually performs over fifteen times faster than previous automated reaction mapping algorithms.