Randomized fast design of short DNA words

  • Authors:
  • Ming-Yang Kao;Manan Sanghi;Robert Schweller

  • Affiliations:
  • Northwestern University, Evanston, IL;Microsoft, Redmond, WA;University of Texas—Pan American, Edinburg, TX

  • Venue:
  • ACM Transactions on Algorithms (TALG)
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

We consider the problem of efficiently designing sets (codes) of equal-length DNA strings (words) that satisfy certain combinatorial constraints. This problem has numerous motivations including DNA self-assembly and DNA computing. Previous work has extended results from coding theory to obtain bounds on code size for new biologically motivated constraints and has applied heuristic local search and genetic algorithm techniques for code design. This article proposes a natural optimization formulation of the DNA code design problem in which the goal is to design n strings that satisfy a given set of constraints while minimizing the length of the strings. For multiple sets of constraints, we provide simple randomized algorithms that run in time polynomial in n and any given constraint parameters, and output strings of length within a constant factor of the optimal with high probability. To the best of our knowledge, this work is the first to consider this type of optimization problem in the context of DNA code design.