Using a seed to rapidly “hit” possible homologies for further examination is a common practice to speed up homology search in molecular sequences. It has been shown that a collection of higher-weight seeds has better sensitivity than a single lower-weight seed at the same speed. However, the huge memory requirements of high-weight seeds diminish their advantages. This paper describes a two-stage extension method that simulates high-weight seeds with modest memory requirements. The paper also proposes the use of so-called daughter seeds, an extension of the previously studied vector seed idea. Daughter seeds, especially when combined with the two-stage extension, provide the flexibility to maximize the independence between the seeds, which is a well-known criterion for maximizing sensitivity. Some other practical techniques for reducing memory usage are also discussed in the paper.
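As background for the terms used in the abstract, the following is a minimal sketch of how an ordinary spaced seed produces hits and why its weight drives memory use. It does not implement the paper's two-stage extension or daughter seeds. The seed pattern, the function names (`seed_weight`, `seed_hits`, `direct_table_entries`), and the assumption of a direct-address lookup table over a 4-letter DNA alphabet are illustrative choices, not taken from the paper.

```python
def seed_weight(seed: str) -> int:
    """Weight of a spaced seed: the number of required-match positions ('1')."""
    return seed.count("1")


def seed_hits(seed: str, a: str, b: str) -> list[tuple[int, int]]:
    """Return all (i, j) such that a[i:] and b[j:] agree at every '1' of the seed.

    Positions marked '0' are "don't care" positions and are ignored.
    """
    span = len(seed)
    required = [k for k, c in enumerate(seed) if c == "1"]

    # Index every window of b by the letters found at the required positions.
    index: dict[str, list[int]] = {}
    for j in range(len(b) - span + 1):
        key = "".join(b[j + k] for k in required)
        index.setdefault(key, []).append(j)

    # Look up every window of a in the index; each match is a hit.
    hits = []
    for i in range(len(a) - span + 1):
        key = "".join(a[i + k] for k in required)
        for j in index.get(key, []):
            hits.append((i, j))
    return hits


def direct_table_entries(weight: int, alphabet_size: int = 4) -> int:
    """Entries in a direct-address table keyed by the required letters
    (assumed layout): one slot per possible key, so alphabet_size ** weight."""
    return alphabet_size ** weight


if __name__ == "__main__":
    seed = "1101"  # illustrative seed of weight 3 and span 4
    print(seed_weight(seed))
    print(seed_hits(seed, "ACGT", "ACTT"))  # mismatch falls on the '0' position
    print(direct_table_entries(11), direct_table_entries(14))
```

Under this assumed table layout, each additional unit of weight multiplies the number of table slots by the alphabet size, which is the kind of memory growth that makes high-weight seeds costly to index directly.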