Fast and Sensitive Probe Selection for DNA Chips Using Jumps in Matching Statistics

  • Authors:
  • Sven Rahmann

  • Venue:
  • CSB '03 Proceedings of the IEEE Computer Society Conference on Bioinformatics
  • Year:
  • 2003

Abstract

The design of large-scale DNA microarrays is a challenging problem. So far, probe selection algorithms must trade the ability to cope with large-scale problems for a loss of accuracy in the estimation of probe quality. We present an approach based on jumps in matching statistics that combines the best of both worlds.

This article consists of two parts. The first part is theoretical. We introduce the notion of jumps in matching statistics between two strings and derive their properties. We estimate the frequency of jumps for random strings in a non-uniform Bernoulli model and present a new heuristic argument to find the center of the length distribution of the longest substring that two random strings have in common. The results are generalized to near-perfect matches with a small number of mismatches.

In the second part, we use the concept of jumps to improve the accuracy of the longest common factor approach for probe selection by moving from a string-based to an energy-based specificity measure, while only slightly more than doubling the selection time.
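To make the terminology concrete, here is a minimal Python sketch of matching statistics and of one plausible reading of a "jump". The abstract does not define either term, so the definitions are assumptions: `ms[i]` is taken to be the length of the longest prefix of `s[i:]` that occurs somewhere in `t` (the usual textbook definition), and position `i` is counted as a jump when `ms[i] > 0` and `ms[i]` exceeds the trivial lower bound `ms[i-1] - 1`. The function names are hypothetical, and the quadratic-time loop is for illustration only; it is not the article's algorithm.

```python
def matching_statistics(s, t):
    """ms[i] = length of the longest prefix of s[i:] occurring in t (assumed definition)."""
    ms = []
    for i in range(len(s)):
        length = 0
        # Extend the match one character at a time while it still occurs in t.
        while i + length < len(s) and s[i:i + length + 1] in t:
            length += 1
        ms.append(length)
    return ms


def jumps(ms):
    """Positions i with ms[i] > 0 and ms[i] > ms[i-1] - 1 (assumed definition of a jump)."""
    positions = []
    for i, m in enumerate(ms):
        if m > 0 and (i == 0 or m > ms[i - 1] - 1):
            positions.append(i)
    return positions


def longest_common_factor_length(s, t):
    """Length of the longest substring shared by s and t, read off at the jumps only."""
    ms = matching_statistics(s, t)
    return max((ms[i] for i in jumps(ms)), default=0)


if __name__ == "__main__":
    ms = matching_statistics("ACGTAC", "GTACGG")
    print(ms)                                          # [3, 2, 4, 3, 2, 1]
    print(jumps(ms))                                   # [0, 2]
    print(longest_common_factor_length("ACGTAC", "GTACGG"))  # 4
```

Under these assumed definitions, the maximum of the matching statistics is always attained at a jump, because between jumps the values only decrease by one per position. This is why inspecting the jumps alone suffices to recover the longest common factor in the sketch.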