Suffix arrays: a new method for on-line string searches
SODA '90 Proceedings of the first annual ACM-SIAM symposium on Discrete algorithms
Rapid Large-Scale Oligonucleotide Selection for Microarrays
CSB '02 Proceedings of the IEEE Computer Society Conference on Bioinformatics
Fast and Accurate Probe Selection Algorithm for Large Genomes
CSB '03 Proceedings of the IEEE Computer Society Conference on Bioinformatics
A recursive algorithm for global alignment with gap consideration in a pair of sequences
SEPADS'06 Proceedings of the 5th WSEAS International Conference on Software Engineering, Parallel and Distributed Systems
We present the first program that selects short oligonucleotide probes (e.g., 25-mers) for microarray experiments on a large scale. Our approach is up to two orders of magnitude faster than previous approaches (e.g., [2], [3]) and is the first that can handle truly large-scale datasets. For example, oligos for human genes can be found within 50 hours. This is made possible by using the longest common substring as a specificity measure for candidate oligos. We present an algorithm, based on a suffix array [1] with additional information, that ranks all candidate oligos according to their specificity and is efficient in both memory usage and running time. We also introduce the concept of master sequences to describe the sequences from which oligos are to be selected. Constraints such as oligo length, melting temperature, and self-complementarity are incorporated in the master sequence at a preprocessing stage and are thus kept separate from the main selection problem. As a result, custom oligos can be designed for any sequenced genome, just as the technology for on-site chip synthesis is becoming increasingly mature. Details will be given in the presentation and can be found in [4].
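To make the specificity measure concrete, the following is a minimal Python sketch of ranking candidate oligos by their longest common substring with any other sequence, using a suffix array plus an LCP (longest common prefix) array. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `#` separator, the naive suffix sorting, and the toy sequences are all hypothetical, and reverse complements, melting temperature, and the master-sequence preprocessing are left out.

```python
def suffix_array(text):
    """Naive construction by sorting suffixes; acceptable for toy inputs only."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_array(text, sa):
    """Kasai's algorithm: lcp[r] is the common prefix length of suffixes sa[r-1] and sa[r]."""
    n = len(text)
    rank = [0] * n
    for r, i in enumerate(sa):
        rank[i] = r
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] == 0:
            h = 0
            continue
        j = sa[rank[i] - 1]
        while i + h < n and j + h < n and text[i + h] == text[j + h]:
            h += 1
        lcp[rank[i]] = h
        if h > 0:
            h -= 1
    return lcp, rank


def longest_foreign_match(seqs):
    """For each text position, the length of the longest string starting there
    that also occurs in a different sequence. Assumes no sequence contains '#'."""
    text, owner, remaining, starts = "", [], [], []
    for s_id, s in enumerate(seqs):
        starts.append(len(text))
        text += s + "#"                            # '#' separates the sequences
        owner += [s_id] * (len(s) + 1)
        remaining += list(range(len(s), -1, -1))   # characters left before '#'
    sa = suffix_array(text)
    lcp, rank = lcp_array(text, sa)
    n = len(text)
    # up[r]: LCP with the nearest suffix above rank r owned by another sequence.
    up = [0] * n
    for r in range(1, n):
        if owner[sa[r]] != owner[sa[r - 1]]:
            up[r] = lcp[r]
        else:
            up[r] = min(up[r - 1], lcp[r])
    # down[r]: the same, looking below rank r.
    down = [0] * n
    for r in range(n - 2, -1, -1):
        if owner[sa[r]] != owner[sa[r + 1]]:
            down[r] = lcp[r + 1]
        else:
            down[r] = min(down[r + 1], lcp[r + 1])
    # Cap by `remaining` so that a match never runs across a separator.
    match = [min(max(up[rank[i]], down[rank[i]]), remaining[i]) for i in range(n)]
    return match, starts


def rank_oligos(seqs, target, k):
    """Return (lcs_length, start, oligo) for every k-mer of seqs[target],
    sorted so that the most specific candidates (smallest lcs_length) come first."""
    match, starts = longest_foreign_match(seqs)
    base, seq = starts[target], seqs[target]
    ranked = []
    for p in range(len(seq) - k + 1):
        # A common substring starting at offset j inside the oligo has length at most k - j.
        lcs = max(min(match[base + p + j], k - j) for j in range(k))
        ranked.append((lcs, p, seq[p:p + k]))
    return sorted(ranked)


if __name__ == "__main__":
    toy = ["ACGTTGCAAC", "TTGCATCG", "GGGACGTA"]   # hypothetical toy "genes"
    for lcs, start, oligo in rank_oligos(toy, target=0, k=5):
        print(lcs, start, oligo)
```

A smaller longest-common-substring value means the candidate shares less contiguous sequence with the other genes, so it is less likely to cross-hybridize. For real genome-scale data, the naive suffix sort would need to be replaced by an efficient suffix array construction.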