Algorithms on strings, trees, and sequences: computer science and computational biology
Indexing and Retrieval for Genomic Databases
IEEE Transactions on Knowledge and Data Engineering
Efficient Index Structures for String Databases
Proceedings of the 27th International Conference on Very Large Data Bases
Effective Indexing and Filtering for Similarity Search in Large Biosequence Databases
BIBE '03 Proceedings of the 3rd IEEE Symposium on BioInformatics and BioEngineering
Survey on index based homology search algorithms
The Journal of Supercomputing
An Efficient Two-Phase Algorithm to Find Gene-Specific Probes for Large Genomes
FBIT '07 Proceedings of the 2007 Frontiers in the Convergence of Bioscience and Information Technologies
A Fast Heuristic Algorithm for Similarity Search in Large DNA Databases
FBIT '07 Proceedings of the 2007 Frontiers in the Convergence of Bioscience and Information Technologies
Computational Biology and Chemistry
Indexing methods for approximate dictionary searching: Comparative analysis
Journal of Experimental Algorithmics (JEA)
Index-based search algorithms are an important part of genomic search, and the way the index is constructed is key to how such an algorithm computes the similarity between two DNA sequences. In this paper, we propose an efficient query processing method that constructs its index using special transformations. The method requires little storage and rapidly finds the similarity between two sequences in a DNA sequence database. First, a sequence is partitioned into windows of equal length. We select the likely subsequences by computing their Hamming distance to the query sequence. The algorithm then transforms the subsequences in each window into a multidimensional vector space by indexing the frequencies of the characters, including the positional information of the characters in the subsequences. Our experiments show that the algorithm runs faster than other heuristic algorithms based on index structures, while being as accurate as those algorithms.
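The steps described above (equal-length windows, a Hamming-distance filter, and a frequency-plus-position vector) can be illustrated with a minimal Python sketch. The abstract does not specify the actual transformation, so everything concrete here is an assumption made for illustration: non-overlapping windows the same length as the query, the four-letter alphabet `ACGT`, the mean position of each base as the "positional information", and the hypothetical names `windows`, `hamming`, `feature_vector`, and `candidate_windows`. It is not the authors' implementation.

```python
ALPHABET = "ACGT"  # assumed DNA alphabet; other symbols (e.g. N) raise ValueError


def windows(sequence, width):
    """Split a sequence into consecutive, non-overlapping windows of equal
    length. A trailing remainder shorter than `width` is dropped."""
    return [sequence[i:i + width]
            for i in range(0, len(sequence) - width + 1, width)]


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def feature_vector(window):
    """Map a window to an 8-dimensional vector: the count of each base,
    followed by the mean 0-based position of each base (0.0 if absent).
    The mean position is an assumed stand-in for 'positional information'."""
    counts = [0] * len(ALPHABET)
    position_sums = [0] * len(ALPHABET)
    for pos, ch in enumerate(window):
        k = ALPHABET.index(ch)
        counts[k] += 1
        position_sums[k] += pos
    means = [position_sums[k] / counts[k] if counts[k] else 0.0
             for k in range(len(ALPHABET))]
    return counts + means


def candidate_windows(sequence, query, max_distance):
    """Return (offset, distance, vector) for every window whose Hamming
    distance to the query is at most `max_distance`."""
    width = len(query)
    result = []
    for idx, w in enumerate(windows(sequence, width)):
        d = hamming(w, query)
        if d <= max_distance:
            result.append((idx * width, d, feature_vector(w)))
    return result
```

For example, `candidate_windows("ACGTACGAGGGG", "ACGT", 1)` keeps the windows at offsets 0 and 4 and discards `GGGG`. In a real index the vectors would presumably be stored in a multidimensional structure and searched by distance, which this sketch leaves out.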