The best known algorithm computes the sensitivity of a given spaced seed on a random region in O((M+L)|B|) time, where M is the length of the seed, L is the length of the random region, and |B| is the size of the seed-compatible-suffix set, which is exponential in the number of 0's in the seed. We developed two algorithms that improve this running time: the first runs in O(|B′|^2 ML) time, where B′ is a subset of B; the second runs in O((M|B|)^2.236 log(L/M)) time, which is much smaller than the original running time when L is large. We also developed a Monte Carlo algorithm that is guaranteed to quickly find a near-optimal seed with high probability.
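To make the quantity being computed concrete, the following is a minimal sketch of a sensitivity calculation. It is not any of the algorithms described above: it is a plain dynamic program over the last M-1 positions, so its cost grows as 2^(M-1) times L rather than with |B|. It assumes that the random region is an i.i.d. Bernoulli sequence in which each position is a match with probability p, that a seed is written as a string of '1' (must match) and '0' (don't care), and that sensitivity means the probability of at least one hit in the region. The function name `seed_sensitivity` and its parameters are illustrative only.

```python
def seed_sensitivity(seed: str, L: int, p: float) -> float:
    """Probability that `seed` hits a random region of length L.

    Assumptions (not from the source): positions are i.i.d., each a
    match with probability p; the seed is a string over {'0', '1'}.
    """
    M = len(seed)
    if M == 0 or set(seed) - {"0", "1"}:
        raise ValueError("seed must be a non-empty string of 0s and 1s")
    if L < M:
        return 0.0  # the seed does not fit in the region

    # Bit mask of required-match positions; the leftmost seed character
    # is the oldest position of a window and maps to the highest bit.
    care = int(seed, 2)
    keep = (1 << (M - 1)) - 1  # keeps only the last M-1 positions

    # no_hit[state] = probability of reaching `state` (last M-1 positions,
    # newest in the lowest bit) without any hit so far.
    no_hit = {0: 1.0}
    for i in range(L):
        nxt = {}
        for state, prob in no_hit.items():
            for bit, weight in ((1, p), (0, 1.0 - p)):
                if weight == 0.0:
                    continue
                window = (state << 1) | bit  # last M positions
                # A hit is possible only once M real positions are seen.
                if i >= M - 1 and window & care == care:
                    continue  # hit: this mass leaves the no-hit set
                key = window & keep
                nxt[key] = nxt.get(key, 0.0) + prob * weight
        no_hit = nxt
    return 1.0 - sum(no_hit.values())


if __name__ == "__main__":
    # Compare a contiguous seed and a spaced seed of equal weight.
    for s in ("111", "1101"):
        print(s, seed_sensitivity(s, 64, 0.7))
```

Because the state space is every combination of the last M-1 positions, this sketch is practical only for short seeds; restricting the states to those compatible with the seed is what ties the running times quoted above to |B| instead.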