A seriate coverage filtration approach for homology search

Authors:
Hsiao Ping Lee;Yin Te Tsai;Chuan Yi Tang
Affiliations:
National Tsing-Hua University, Hsinchu, Taiwan, ROC;Providence University, Shalu, Taiwan, ROC;National Tsing-Hua University, Hsinchu, Taiwan, ROC
Venue:
Proceedings of the 2004 ACM symposium on Applied computing
Year:
2004

Citing 5:
q-gram based database searching using a suffix array (QUASAR). RECOMB '99 Proceedings of the third annual international conference on Computational molecular biology
Indexing and Retrieval for Genomic Databases. IEEE Transactions on Knowledge and Data Engineering
Computing the Threshold for q-Gram Filters. SWAT '02 Proceedings of the 8th Scandinavian Workshop on Algorithm Theory
A Metric Index for Approximate String Matching. LATIN '02 Proceedings of the 5th Latin American Symposium on Theoretical Informatics
Better filtering with gapped q-grams. Fundamenta Informaticae - Special issue on computing patterns in strings

Cited 1:
A hash trie filter method for approximate string matching in genomic databases. Applied Intelligence

Abstract

Homology search within genomic databases is a fundamental and crucial task in biological knowledge discovery. As databases grow exponentially in size and are accessed ever more heavily, efficient retrieval becomes increasingly essential in bioinformatics. Because biological data vary, similar sequences must not only fall within some error tolerance but also exceed some seriate coverage level. In this paper, we propose a seriate coverage filtration approach that extracts homologies from databases efficiently. Our approach performs lossless filtration and can be implemented as a preprocessing step for existing search heuristics. Our method converts a user's requested error and seriate coverage levels into thresholds of interest. Accordingly, we transform the task of homology discovery into a variation of the longest increasing subsequence problem and design an efficient algorithm for it. In performance tests, our approach shows attractive filtration quality.
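
For readers unfamiliar with the longest increasing subsequence (LIS) problem mentioned in the abstract, the following is a minimal Python sketch of the classic, unmodified problem only. The abstract does not describe the variation the authors use, so this is not their algorithm; the function name `longest_increasing_subsequence` and the sample input are illustrative assumptions.

```python
from bisect import bisect_left


def longest_increasing_subsequence(values):
    """Return one strictly increasing subsequence of maximum length.

    Classic O(n log n) method (illustrative only, not the paper's variant).
    """
    tail_idx = []   # tail_idx[k]: index of the smallest tail of a length k+1 subsequence
    tail_val = []   # tail_val[k]: the value at that index (kept sorted for bisect)
    prev = [-1] * len(values)  # back-pointers for reconstruction

    for i, v in enumerate(values):
        # First position whose tail is >= v; v extends subsequences of length k.
        k = bisect_left(tail_val, v)
        if k == len(tail_val):
            tail_idx.append(i)
            tail_val.append(v)
        else:
            tail_idx[k] = i
            tail_val[k] = v
        prev[i] = tail_idx[k - 1] if k > 0 else -1

    # Walk the back-pointers from the tail of the longest subsequence.
    result = []
    i = tail_idx[-1] if tail_idx else -1
    while i != -1:
        result.append(values[i])
        i = prev[i]
    return result[::-1]


if __name__ == "__main__":
    print(longest_increasing_subsequence([3, 1, 4, 1, 5, 9, 2, 6]))
```

In a filtration setting, one might apply such a routine to the positions of matching fragments so that only candidates with a sufficiently long order-preserving chain survive; that use is an assumption about the general idea, not a description of the paper's procedure.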