Overlaps help: Improved bounds for group testing with interval queries

  • Authors:
  • Ferdinando Cicalese; Peter Damaschke; Libertad Tansini; Sören Werth

  • Affiliations:
  • Institut für Bioinformatik, Centrum für Biotechnologie (CeBiTec), Universität Bielefeld, 33594 Bielefeld, Germany; Department of Computer Science and Engineering, Chalmers University, 41296 Göteborg, Sweden; Department of Computer Science and Engineering, Chalmers University, 41296 Göteborg, Sweden; Institut für Informatik und Praktische Mathematik, Christian-Albrechts-Universität zu Kiel, 24118 Kiel, Germany

  • Venue:
  • Discrete Applied Mathematics
  • Year:
  • 2007

Abstract

Given a finite ordered set of items and an unknown distinguished subset P of up to p positive elements, identify the items in P by asking the least number of queries of the type "does the subset Q intersect P?", where Q is a subset of consecutive elements of {1,2,...,n}. This problem arises, e.g., in computational biology, in a particular method for determining splice sites in genes. We consider time-efficient algorithms where queries are arranged in a fixed number s of stages: in each stage, queries are performed in parallel. In a recent bioinformatics paper, we proved optimality (subject to lower-order terms), with respect to the number of queries, of some strategies for the special cases p=1 or s=2. Exploiting new ideas, we are now able to provide improved lower bounds for any p≥2 and s≥3 and improved upper bounds for larger s. Most notably, our new bounds converge as s grows. Our new query scheme uses overlapping query intervals within a stage, which is effective for large enough s. This contrasts with our previous results for s≤3. Anyway, the remaining gaps between the current upper and lower bounds for any fixed s≥3 amount to small constant factors in the main term. The paper ends with a discussion of practical implications in the case that the positive elements are well separated.
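
To make the query model concrete, the following is a minimal sketch in Python of staged interval queries for the simplest case of at most one positive element. It is not the strategy analyzed in the paper: it uses only disjoint intervals of roughly equal size in each stage and makes no claim of optimality. The names `make_oracle` and `find_single_positive` are hypothetical and chosen for illustration only.

```python
def make_oracle(positives):
    """Return a query function answering: does {lo, ..., hi} intersect P?"""
    pos = set(positives)

    def query(lo, hi):
        # Interval bounds are inclusive and refer to items 1..n.
        return any(lo <= x <= hi for x in pos)

    return query


def find_single_positive(n, stages, query):
    """Locate at most one positive in {1, ..., n} within `stages` stages.

    Illustrative assumption: each stage splits the current candidate
    interval into k disjoint pieces, with k the smallest integer such
    that k ** stages >= n. Returns (positive or None, queries used).
    """
    k = 2
    while k ** stages < n:
        k += 1

    lo, hi = 1, n
    used = 0
    for _ in range(stages):
        size = hi - lo + 1
        step = -(-size // k)  # ceiling of size / k
        batch = [(a, min(a + step - 1, hi)) for a in range(lo, hi + 1, step)]
        # All queries of one stage are fixed before any answer is seen,
        # which models asking them in parallel.
        answers = [query(a, b) for a, b in batch]
        used += len(batch)
        hits = [iv for iv, ans in zip(batch, answers) if ans]
        if not hits:
            return None, used  # no positive element at all
        lo, hi = hits[0]
        if lo == hi:
            return lo, used  # a positive answer on a single item
    return lo, used


if __name__ == "__main__":
    print(find_single_positive(100, 2, make_oracle({37})))
    print(find_single_positive(100, 3, make_oracle({37})))
```

In this sketch, allowing more stages lowers the total number of queries, which illustrates the trade-off between stages and queries that the abstract refers to. The overlapping intervals of the paper's new scheme are not modeled here.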