A sub-quadratic sequence alignment algorithm for unrestricted cost matrices
SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
Sparse LCS common substring alignment
Information Processing Letters
Finding approximate tandem repeats in genomic sequences
RECOMB '04 Proceedings of the eighth annual international conference on Research in computational molecular biology
FireμSat: meeting the challenge of detecting microsatellites in DNA
SAICSIT '06 Proceedings of the 2006 annual research conference of the South African institute of computer scientists and information technologists on IT research in developing countries
Sequence Alignment Algorithms for Run-Length-Encoded Strings
COCOON '08 Proceedings of the 14th annual international conference on Computing and Combinatorics
Detection of tandem repeats in DNA sequences based on parametric spectral estimation
IEEE Transactions on Information Technology in Biomedicine - Special section on computational intelligence in medical systems
Sparse LCS common substring alignment
CPM'03 Proceedings of the 14th annual conference on Combinatorial pattern matching
An efficient algorithm for finding long conserved regions between genes
CompLife'06 Proceedings of the Second international conference on Computational Life Sciences
On the complexity of sparse exon assembly
CPM'05 Proceedings of the 16th annual conference on Combinatorial Pattern Matching
In this paper, we present an $O(N^2 \log^2 N)$ algorithm for finding the two nonoverlapping substrings of a given string of length $N$ which have the highest-scoring alignment between them. This significantly improves the previously best-known bound of $O(N^3)$ for the worst-case complexity of this problem. One of the central ideas in the design of this algorithm is that of partitioning a matrix into pieces in such a way that all submatrices of interest for this problem can be put together as the union of very few of these pieces. Other ideas include the use of candidate lists, an application of the ideas of Apostolico et al. [SIAM J. Comput., 19 (1990), pp. 968--988] to our problem domain, and divide-and-conquer techniques.
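To make the problem statement concrete, the following is a minimal sketch of a straightforward cubic-time baseline, not the $O(N^2 \log^2 N)$ algorithm of the paper. It relies on the observation that two nonoverlapping substrings are always separated by some split point, so it tries each of the $N-1$ splits and runs a standard quadratic-time local alignment between the prefix and the suffix. The function names and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration; the abstract does not specify a scoring scheme.

```python
def local_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best local alignment score between a and b (Smith-Waterman recurrence).

    Runs in O(len(a) * len(b)) time and keeps only two rows of the table.
    The scoring parameters are illustrative assumptions.
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                  # start a new local alignment here
                prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                prev[j] + gap,      # gap in b
                curr[j - 1] + gap,  # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


def best_nonoverlapping_score(s, **scoring):
    """Highest alignment score over all pairs of nonoverlapping substrings of s.

    Baseline only: N - 1 split points times one quadratic alignment each,
    giving cubic time overall.
    """
    best = 0
    for k in range(1, len(s)):
        # One substring lies in s[:k], the other in s[k:], so they cannot overlap.
        best = max(best, local_alignment_score(s[:k], s[k:], **scoring))
    return best


print(best_nonoverlapping_score("abcabc"))  # 3: "abc" aligned with "abc"
```

Each split recomputes a full alignment table from scratch, which is where the cubic cost comes from; the partitioning idea described in the abstract is aimed at sharing that work across the submatrices of interest.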