Seed optimization for i.i.d. similarities is no easier than optimal Golomb ruler design

Authors:
Bin Ma; Hongyi Yao
Affiliations:
David Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, N2L 3G1, Canada; Institute for Theoretical Computer Science, Tsinghua University, Beijing, 100084, China
Venue:
Information Processing Letters
Year:
2009

Abstract

The spaced seed is a filtration method for efficiently identifying the regions of interest in string similarity searches. It is important to find the optimal spaced seed, that is, the one that achieves the highest search sensitivity. For some simple distributions of the similarities, the seed optimization problem was proved not to be NP-hard. On the other hand, no polynomial-time algorithm has been found despite extensive research in the literature. In this article we examine the hardness of the seed optimization problem through a polynomial-time reduction from the optimal Golomb ruler design problem, which is a well-known difficult (but not NP-hard) problem in combinatorial design.
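
To make the two objects named in the abstract concrete, the following is a minimal illustrative sketch, not the paper's reduction. It assumes the usual conventions: a spaced seed is a 0/1 string whose 1-positions must all match, a similarity is a 0/1 string of matches and mismatches, and under the i.i.d. model each position is a match independently with probability p. The function names (`is_golomb_ruler`, `seed_hits`, `sensitivity`) are hypothetical, and the sensitivity routine is a brute-force enumeration that is only usable for very short regions.

```python
from itertools import combinations, product


def is_golomb_ruler(marks):
    """Return True if all pairwise differences between marks are distinct."""
    diffs = [b - a for a, b in combinations(sorted(marks), 2)]
    return len(diffs) == len(set(diffs))


def seed_hits(seed, similarity):
    """Return True if the spaced seed matches at some offset of the similarity.

    seed: string over {'0', '1'}; '1' marks a required match position.
    similarity: sequence over {'0', '1'}; '1' marks a matching position.
    """
    required = [i for i, c in enumerate(seed) if c == "1"]
    for start in range(len(similarity) - len(seed) + 1):
        if all(similarity[start + i] == "1" for i in required):
            return True
    return False


def sensitivity(seed, region_length, p):
    """Hit probability of the seed on an i.i.d. region (assumed Bernoulli(p) matches).

    Enumerates all 2**region_length similarities, so it is exponential in
    region_length and meant only as an illustration of the definition.
    """
    total = 0.0
    for bits in product("01", repeat=region_length):
        if seed_hits(seed, bits):
            ones = bits.count("1")
            total += p ** ones * (1 - p) ** (region_length - ones)
    return total
```

For example, `is_golomb_ruler([0, 1, 4, 6])` holds because the six pairwise differences 1, 2, 3, 4, 5, 6 are all distinct, while `[0, 1, 2]` fails because the difference 1 occurs twice. Seed optimization then asks, for a given seed length and number of 1-positions, which seed maximizes `sensitivity`.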