The search for repeated words in sequences has been approached in several ways. Methods fall into two categories: exact and approximate. Exact methods find repetitions of a word while tolerating no errors or differences between occurrences. Approximate methods find the words that differ from a target word M by at most K differences (or errors). When words are sequences of letters or characters drawn from an alphabet S, these errors take the form of substitutions, deletions, or insertions of characters. Approximate methods are more commonly used in bioinformatics because their greater flexibility allows more words to be found. Among the several sequence analysis techniques, the most frequently used compares sequences by aligning them. In this paper, we first delimit the scope of our work by reviewing these different techniques. We then present KMRCRelat, a new fast and efficient algorithm derived from two earlier algorithms. KMRCRelat uses the concept of relational words: a word is represented by its components and the relations between them. We consider only the components and their successor relations.
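To make the notion of "K differences" concrete, the sketch below counts substitutions, deletions, and insertions with the classical dynamic-programming edit distance, and then reports where a target word occurs in a sequence with at most K such errors. It illustrates approximate word search in general and is not the KMRCRelat algorithm; the function names `edit_distance` and `approximate_occurrences` are hypothetical and chosen only for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of substitutions, deletions and insertions turning a into b."""
    prev = list(range(len(b) + 1))  # distance from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i] + [0] * len(b)
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr[j] = min(prev[j - 1] + cost,  # substitution (or match)
                          prev[j] + 1,         # deletion of ca
                          curr[j - 1] + 1)     # insertion of cb
        prev = curr
    return prev[-1]


def approximate_occurrences(text: str, word: str, k: int) -> list[int]:
    """End positions (exclusive) in text where word occurs with at most k errors.

    Same recurrence as edit_distance, except that row 0 is all zeros so that
    an occurrence may start at any position of the text.
    """
    m = len(word)
    prev = list(range(m + 1))  # column for the empty text prefix
    hits = []
    for j, c in enumerate(text, start=1):
        curr = [0] * (m + 1)   # curr[0] = 0: a match may start here for free
        for i in range(1, m + 1):
            cost = 0 if word[i - 1] == c else 1
            curr[i] = min(prev[i - 1] + cost,  # substitution (or match)
                          prev[i] + 1,         # extra character in the text
                          curr[i - 1] + 1)     # missing character in the text
        if curr[m] <= k:
            hits.append(j)
        prev = curr
    return hits


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
    print(approximate_occurrences("GATTACA", "ATTA", 0))
    print(approximate_occurrences("ACCT", "ACGT", 1))
```

With K = 0 the second function reduces to exact search, which is the sense in which the exact methods above tolerate no differences; raising K trades precision for the flexibility that the approximate methods offer.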