DNA sequence alignment for similarity search is a vital topic in bioinformatics algorithm development. Searching a large-scale DNA database for the set of sequences, S, that are similar to a query sequence, q, is computationally demanding and requires high processor performance as well as large amounts of memory. Dynamic programming algorithms with quadratic running time are frequently used to produce an optimal local sequence alignment. However, these algorithms are time-consuming when dealing with long DNA sequences. Using local alignment, this paper presents a framework for searching large-scale DNA databases for a set of similar sequences with reliable output and minimal cost. The Knuth-Morris-Pratt (KMP) algorithm is adapted to act as a filtering mechanism before exhaustive dynamic programming is applied. The KMP algorithm is used to scan the sequences in the databases for the patterns generated from the query sequence. This filtering process generates scores that are used for ranking. The Smith-Waterman algorithm is then applied to each sequence, starting from the top of the constructed ranking. The paper also discusses the optimal pattern length for the database scanning process. The experimental results show that the proposed filtering mechanism discards irrelevant sequences. Therefore, the time needed to search the databases and retrieve the set of sequences similar to the query is minimized.
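The filter-then-align pipeline described above can be illustrated with a minimal sketch. The abstract does not specify how patterns are generated from the query, how filter scores are computed, or which Smith-Waterman parameters are used, so everything below is an assumption made for illustration: patterns are taken to be all distinct fixed-length substrings of the query, the filter score is the number of distinct patterns found in a sequence, sequences with a score of zero are discarded, and the alignment uses a linear gap penalty with hypothetical values `match=2`, `mismatch=-1`, `gap=-1`. The function names (`filter_score`, `search`, and so on) are likewise hypothetical.

```python
def failure_table(pattern):
    """KMP failure function: fail[i] is the length of the longest proper
    prefix of pattern[:i + 1] that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_count(text, pattern):
    """Count occurrences (overlaps included) of pattern in text in linear time."""
    if not pattern:
        return 0
    fail = failure_table(pattern)
    count = 0
    k = 0
    for ch in text:
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            count += 1
            k = fail[k - 1]
    return count


def query_patterns(query, length):
    """Assumed pattern generation: every distinct substring of the given length."""
    return {query[i:i + length] for i in range(len(query) - length + 1)}


def filter_score(query, sequence, length):
    """Assumed scoring rule: number of distinct query patterns found in sequence."""
    return sum(1 for p in query_patterns(query, length) if kmp_count(sequence, p) > 0)


def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score with a linear gap penalty, O(len(a) * len(b))."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(0, prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


def search(query, database, length=4, top_k=2):
    """Rank sequences by filter score, drop those scoring zero, and run
    Smith-Waterman only on the top_k survivors, in ranking order."""
    scores = {name: filter_score(query, seq, length) for name, seq in database.items()}
    ranked = sorted((n for n in scores if scores[n] > 0),
                    key=lambda n: scores[n], reverse=True)
    return [(n, smith_waterman(query, database[n])) for n in ranked[:top_k]]


if __name__ == "__main__":
    # Toy database; names and sequences are invented for this example.
    db = {"s1": "TTACGTACGTTT", "s2": "GGGGGGGG", "s3": "CCACGTCC"}
    for name, score in search("ACGTACGT", db):
        print(name, score)
```

In this sketch the quadratic-time alignment runs only on sequences that survive the linear-time KMP scan, which is the source of the time saving the abstract reports. The `length` parameter corresponds to the pattern length that the paper discusses tuning; the default of 4 here is arbitrary.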