Given a finite ordered set of items and an unknown distinguished subset P of up to p positive elements, identify the items in P by asking the least number of queries of the type "does the subset Q intersect P?", where Q is a subset of consecutive elements of {1,2,...,n}. This problem arises, e.g., in computational biology, in a particular method for determining splice sites in genes. We consider time-efficient algorithms where queries are arranged in a fixed number s of stages: in each stage, queries are performed in parallel. In a recent bioinformatics paper, we proved the optimality (up to lower-order terms), with respect to the number of queries, of some strategies for the special cases p=1 or s≤2. Exploiting new ideas, we are now able to provide improved lower bounds for any p≥2 and s≥3, and improved upper bounds for larger s. Most notably, our new bounds converge as s grows. Our new query scheme uses overlapping query intervals within a stage, which is effective for large enough s. This contrasts with our previous results for s≤2. In any case, the remaining gaps between the current upper and lower bounds for any fixed s≥3 amount to small constant factors in the main term. The paper ends with a discussion of practical implications in the case that the positive elements are well separated.
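To make the query model concrete, the following is a minimal Python sketch of staged interval group testing. It is not the strategy from the paper: it is a naive two-stage baseline with disjoint blocks, and the names `run_stage`, `naive_two_stage`, and the parameter `block` are hypothetical, introduced only for illustration. All queries of a stage are fixed before any answer of that stage is seen, which is what "performed in parallel" means here.

```python
from typing import Iterable, List, Set, Tuple

Interval = Tuple[int, int]  # inclusive endpoints (lo, hi) over items 1..n


def run_stage(positives: Set[int], queries: Iterable[Interval]) -> List[bool]:
    """Answer one stage of interval queries: does [lo, hi] intersect P?"""
    return [any(lo <= x <= hi for x in positives) for lo, hi in queries]


def naive_two_stage(n: int, positives: Set[int], block: int) -> Tuple[Set[int], int]:
    """Illustrative baseline only (an assumption, not the paper's scheme).

    Stage 1 queries disjoint blocks of length `block`.
    Stage 2 queries every single item inside the blocks that answered yes.
    Returns the identified positives and the total number of queries.
    """
    # Stage 1: partition {1,...,n} into consecutive disjoint intervals.
    stage1 = [(lo, min(lo + block - 1, n)) for lo in range(1, n + 1, block)]
    answers1 = run_stage(positives, stage1)

    # Stage 2: singleton intervals [x, x] for each item of a positive block.
    stage2 = [
        (x, x)
        for (lo, hi), hit in zip(stage1, answers1)
        if hit
        for x in range(lo, hi + 1)
    ]
    answers2 = run_stage(positives, stage2)

    found = {lo for (lo, _), hit in zip(stage2, answers2) if hit}
    return found, len(stage1) + len(stage2)
```

For example, `naive_two_stage(100, {17, 18, 73}, 10)` recovers all three positives with 30 queries: 10 block queries in the first stage and 20 singleton queries in the second. The bounds discussed in the abstract concern how far below such a baseline the query count can be pushed for given p and s.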