An O(n log n) algorithm for finding all repetitions in a string
Journal of Algorithms
An algorithm for finding tandem repeats of unspecified pattern size
RECOMB '98 Proceedings of the second annual international conference on Computational molecular biology
Identifying satellites in nucleic acid sequences
RECOMB '98 Proceedings of the second annual international conference on Computational molecular biology
Normal forms of quasiperiodic strings
Theoretical Computer Science
Computation and Visualization of Degenerate Repeats in Complete Genomes
Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology
Evolutive Tandem Repeats Using Hamming Distance
MFCS '02 Proceedings of the 27th International Symposium on Mathematical Foundations of Computer Science
An Algorithm for Approximate Tandem Repeats
CPM '93 Proceedings of the 4th Annual Symposium on Combinatorial Pattern Matching
Finding Maximal Repetitions in a Word in Linear Time
FOCS '99 Proceedings of the 40th Annual Symposium on Foundations of Computer Science
Classification of Tandem Repeats in the Human Genome
International Journal of Knowledge Discovery in Bioinformatics
We recently introduced evolutive tandem repeats with jump (using Hamming distance) (Proc. MFCS'02: the 27th Internat. Symp. Mathematical Foundations of Computer Science, Warszawa, Otwock, Poland, August 2002, Lecture Notes in Computer Science, Vol. 2420, Springer, Berlin, pp. 292-304), which consist of a series of almost contiguous copies with the following property: the Hamming distance between two consecutive copies is always smaller than a given parameter e. In this article, we present a significant improvement that speeds up the detection of evolutive tandem repeats. It is based on the progressive computation of distances between the candidate copies participating in an evolutive tandem repeat. It leads to a new algorithm that is still quadratic in the worst case but much more efficient on average, allowing larger sequences to be processed.
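To make the defining property concrete, the following is a minimal Python sketch, not the algorithm of the article. It assumes strictly contiguous copies of a fixed length p (the "jump" between copies is ignored), and the function names `hamming_below` and `count_copies` are hypothetical. It only illustrates the condition that each pair of consecutive copies has Hamming distance smaller than e, with the distance computation abandoned as soon as the bound is reached.

```python
def hamming_below(s, i, j, p, e):
    """Return True if the Hamming distance between s[i:i+p] and s[j:j+p] is < e.

    Illustrative helper (not from the article): the comparison stops as soon
    as the number of mismatches reaches e.
    """
    mismatches = 0
    for k in range(p):
        if s[i + k] != s[j + k]:
            mismatches += 1
            if mismatches >= e:
                return False  # bound reached, no need to compare further
    return mismatches < e


def count_copies(s, start, p, e):
    """Count contiguous length-p copies starting at `start` such that every
    pair of consecutive copies has Hamming distance < e.

    Assumption: copies are strictly contiguous (no jump between them).
    """
    if p <= 0 or start < 0 or start + p > len(s):
        return 0  # no complete first copy
    copies = 1
    pos = start
    # Extend the chain while the next full copy exists and is close enough
    # to the current one; only consecutive copies are compared.
    while pos + 2 * p <= len(s) and hamming_below(s, pos, pos + p, p, e):
        copies += 1
        pos += p
    return copies
```

For example, with s = "ACGTACGAACCATTTT", p = 4 and e = 2, `count_copies(s, 0, 4, 2)` counts three copies (ACGT, ACGA, ACCA). The first and third copies differ in two positions, which is allowed because only consecutive copies are compared; this is what makes the repeat evolutive.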