Euler circuits and DNA sequencing by hybridization
Discrete Applied Mathematics - Special volume on combinatorial molecular biology
Handling long targets and errors in sequencing by hybridization
Proceedings of the sixth annual international conference on Computational biology
Hybrid Genetic Algorithm for DNA Sequencing with Errors
Journal of Heuristics
MFCS '94 Proceedings of the 19th International Symposium on Mathematical Foundations of Computer Science 1994
An integer programming approach to DNA sequence assembly
Computational Biology and Chemistry
Sequencing by hybridization (SBH) is a method for reconstructing a DNA sequence given the set of all contiguous subsequences of length k of the target sequence. This set, called the spectrum of the sequence, can be obtained from hybridization with a universal DNA chip. However, the hybridization experiments are error prone, which leads to the computational problem of reconstructing a sequence from a noisy spectrum. Halperin et al. gave an algorithm for this problem with provable performance in the presence of both false positive and false negative errors. Assuming, for example, that the false positive rate is small and the probability of a false negative is 0.1, their algorithm can reconstruct a random sequence of length O(2^{0.7k}) with an arbitrarily small probability of failure. In this paper, we give an algorithm that can reconstruct longer sequences: under the assumptions above, our algorithm can reconstruct sequences of length O(2^{0.942k}). This bound is almost optimal, as the bound for the errorless case is Θ(2^k).
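To make the notion of a spectrum concrete, the following is a minimal Python sketch of the errorless case only. It is not the algorithm of this paper or of Halperin et al., and it does not handle false positives or false negatives. It uses the classical Eulerian-path formulation on the de Bruijn graph of the spectrum; the function names `spectrum` and `reconstruct` are hypothetical, and the sketch assumes every k-mer occurs at most once in the target.

```python
from collections import defaultdict


def spectrum(seq, k):
    """Return the set of all contiguous length-k substrings (k-mers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def reconstruct(spec):
    """Rebuild one sequence consistent with an error-free spectrum.

    Each k-mer is treated as an edge from its (k-1)-prefix to its
    (k-1)-suffix; an Eulerian path through these edges spells a sequence
    whose spectrum equals spec. The result is unique only when the graph
    admits a single Eulerian path.
    """
    if not spec:
        return ""
    graph = defaultdict(list)
    balance = defaultdict(int)  # out-degree minus in-degree per node
    for kmer in sorted(spec):
        u, v = kmer[:-1], kmer[1:]
        graph[u].append(v)
        balance[u] += 1
        balance[v] -= 1
    # A path starts at the node with one surplus outgoing edge; if the
    # graph is balanced (an Eulerian circuit), any node with edges works.
    start = next((n for n, b in balance.items() if b == 1), next(iter(graph)))
    # Hierholzer's algorithm, iterative version.
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if graph[node]:
            stack.append(graph[node].pop())
        else:
            path.append(stack.pop())
    path.reverse()
    if len(path) != len(spec) + 1:
        raise ValueError("spectrum does not admit a single Eulerian path")
    # Spell the sequence: first node, then the last letter of each next node.
    return path[0] + "".join(node[-1] for node in path[1:])
```

For example, `reconstruct(spectrum("ATGCAGTC", 3))` returns `"ATGCAGTC"`, because all 2-mers of that string are distinct and the Eulerian path is therefore unique. For a string such as `"ATGCGTGGCA"` with k = 3, more than one Eulerian path exists, so the sketch returns some sequence with the same spectrum but not necessarily the original one. This ambiguity is what limits the reconstructible length even without errors.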