An efficient algorithm for finding gene-specific probes for DNA microarrays

Authors:
Mun-Ho Choi;In-Seon Jeong;Seung-Ho Kang;Hyeong-Seok Lim
Affiliations:
Dept. of Computer Science, Chonnam National University, Buk-gu, Gwangju, Korea (all authors)
Venue:
ISBRA'07: Proceedings of the 3rd International Conference on Bioinformatics Research and Applications
Year:
2007

Abstract

The accuracy of a DNA microarray depends considerably on the quality of the probes it uses: a good probe should be specific to exactly one gene. Most sequence-based algorithms use the edit distance to the target sequences as the measure of a probe's specificity. We propose a novel algorithm for finding gene-specific probes that avoids a large amount of redundant edit-distance computation while maintaining the same accuracy as an exhaustive search. Our approach exploits the fact that when the starting position of a probe candidate is moved by only a few base pairs, the change in its edit distance to the off-target sequence is limited. The proposed algorithm does not use any index structures and is insensitive to the length of the probes. It enables short (20 to 30 bases) or long (50 or more bases) probes to be computed for genomes of size 10M within a day.
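The idea of bounding how much the edit distance can change under a small shift can be illustrated with a short sketch. The abstract does not give the paper's actual bound or algorithm, so everything below is an assumption made for illustration: the function names are hypothetical, the distance is the standard semi-global edit distance (probe against the best-matching substring of the off-target sequence), and the bound used is the generic one that follows from the triangle inequality, namely that shifting a window by k positions changes the distance by at most 2k. The paper's own bound may well be tighter.

```python
from typing import List, Tuple


def semi_global_edit_distance(probe: str, target: str) -> int:
    """Minimum edit distance between probe and any substring of target."""
    # Row 0 is all zeros: a match may start at any position of the target for free.
    prev = [0] * (len(target) + 1)
    for i, pc in enumerate(probe, start=1):
        curr = [i] + [0] * len(target)  # column 0: probe prefix vs. empty substring
        for j, tc in enumerate(target, start=1):
            curr[j] = min(
                prev[j - 1] + (pc != tc),  # match or substitution
                prev[j] + 1,               # probe character left unmatched
                curr[j - 1] + 1,           # extra target character
            )
        prev = curr
    return min(prev)  # a match may end at any position of the target


def specific_starts_exhaustive(
    gene: str, off_target: str, probe_len: int, threshold: int
) -> List[int]:
    """Baseline: run the DP for every candidate start position."""
    return [
        i
        for i in range(len(gene) - probe_len + 1)
        if semi_global_edit_distance(gene[i:i + probe_len], off_target) >= threshold
    ]


def specific_starts_with_skips(
    gene: str, off_target: str, probe_len: int, threshold: int
) -> Tuple[List[int], int]:
    """Same result as the baseline, skipping the DP when the shift bound decides.

    Assumed bound (triangle inequality, not taken from the paper): two windows
    whose starts differ by k are within edit distance 2k of each other, so their
    distances to the off-target sequence differ by at most 2k.
    """
    specific: List[int] = []
    dp_calls = 0
    anchor_pos = None   # last start position where the DP was actually run
    anchor_dist = 0     # distance computed at that position
    for i in range(len(gene) - probe_len + 1):
        if anchor_pos is not None:
            slack = 2 * (i - anchor_pos)
            if anchor_dist - slack >= threshold:
                specific.append(i)  # guaranteed far enough from the off-target
                continue
            if anchor_dist + slack < threshold:
                continue            # guaranteed too close to the off-target
        anchor_dist = semi_global_edit_distance(gene[i:i + probe_len], off_target)
        anchor_pos = i
        dp_calls += 1
        if anchor_dist >= threshold:
            specific.append(i)
    return specific, dp_calls
```

A candidate counts as specific here when its distance to the off-target sequence is at least `threshold`. Because the skip decisions rely only on a provable bound, the sketch returns the same start positions as the exhaustive scan while running the dynamic program fewer times whenever the computed distance lies far from the threshold. With several off-target sequences, the same test would be applied to each one in turn.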