A Hash Trie Filter Approach to Approximate String Matching for Genomic Databases

Authors:
Ye-In Chang; Jiun-Rung Chen; Min-Tze Hsu
Affiliations:
Dept. of Computer Science and Engineering, National Sun Yat-Sen University, Kaohsiung, Taiwan; Dept. of Computer Science and Engineering, National Sun Yat-Sen University, Kaohsiung, Taiwan; Dept. of Computer Science and Engineering, National Sun Yat-Sen University, Kaohsiung, Taiwan
Venue:
IEA/AIE '09 Proceedings of the 22nd International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems: Next-Generation Applied Intelligence
Year:
2009

Abstract

For genomic databases, approximate string matching with k errors is commonly used to search genomic sequences, where the k errors may arise from substitution, insertion, or deletion operations. In this paper, we propose a new approximate string matching method, the hash trie filter, for efficient searching in genomic databases. Our method not only reduces the number of candidates by pruning unreasonable matched positions, but also dynamically decides the number of ordered matched grams of a candidate, which increases precision. Experimental results show that the hash trie filter outperforms the well-known (k + s) q-samples filter in both response time and precision, across different query pattern lengths and different error levels.
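
To make the problem setting concrete, the following is a minimal sketch of matching with k errors using the classic dynamic-programming formulation, in which substitutions, insertions, and deletions each cost one error. It is not the hash trie filter, whose details are not given in the abstract; it only illustrates the verification problem that filter methods aim to speed up. The function name approx_match_ends and the example sequences are assumptions made for illustration.

```python
def approx_match_ends(pattern, text, k):
    """Return the end positions (0-based, inclusive) in `text` where some
    substring matches `pattern` with at most k edit errors.

    Illustrative baseline only (hypothetical helper, not the paper's method).
    """
    m = len(pattern)
    # col[i] = minimum number of errors needed to align pattern[:i]
    # with some substring of text ending at the current text position.
    col = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text):
        prev_diag = col[0]   # value of row i-1 in the previous column
        col[0] = 0           # a match may start at any text position
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            new_val = min(
                prev_diag + cost,  # match or substitution
                col[i] + 1,        # extra character in the text
                col[i - 1] + 1,    # pattern character missing from the text
            )
            prev_diag = col[i]
            col[i] = new_val
        if col[m] <= k:
            ends.append(j)
    return ends


if __name__ == "__main__":
    # Exact occurrence of the pattern (k = 0).
    print(approx_match_ends("ACGT", "TTACGTTT", 0))
    # One substitution (G -> C) in the text, tolerated with k = 1.
    print(approx_match_ends("ACGT", "TTACCTTT", 1))
```

This baseline examines every text position, so its cost grows with the product of the pattern and text lengths. Filter approaches such as the one described in the abstract try to discard most positions cheaply and run this kind of verification only on the remaining candidates.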