An incremental algorithm for efficient unique signature discoveries on DNA databases

Authors:
Hsiao Ping Lee; Tzu-Fang Sheu; Chuan Yi Tang
Affiliations:
Chung Shan Medical University, Taichung, Taiwan, R.O.C. and National Tsing Hua University, Hsinchu, Taiwan, R.O.C.; Providence University, Taichung, Taiwan, R.O.C.; National Tsing Hua University, Hsinchu, Taiwan, R.O.C.
Venue:
Proceedings of the 2010 ACM Symposium on Applied Computing
Year:
2010

Abstract

DNA signatures are distinct short nucleotide sequences that can be used to detect the presence of certain organisms and to distinguish those organisms from all other species. The signatures provide valuable information for many applications, such as PCR primer design and microarray experiments. In practice, we use a discovery algorithm to find unique signatures in DNA databases and then apply the signatures to microarray experiments. If the discovered result is unsatisfactory, we change the parameter settings of the algorithm to obtain a new result. This process of changing parameter settings may be repeated until a satisfactory result is obtained; we call this consecutively multiple discoveries. The situation arises frequently, especially when we handle unfamiliar DNA databases. The challenge is to accomplish each new discovery efficiently. Existing discovery algorithms do not consider the needs of consecutively multiple discoveries. In this paper, we propose an incremental algorithm designed specifically for consecutively multiple discoveries. The algorithm is based on observations about the properties of the signatures. It finds the new result by using the previously discovered results as candidates rather than performing a complete discovery on the whole database. Because the candidates in an incremental discovery are limited to the previously discovered signatures, the discovery process is accelerated. Compared with typical discovery algorithms that perform complete discoveries on the whole database, our incremental algorithm saves up to 87% of the execution time in our experiments.
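
The candidate-reuse idea can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not define uniqueness or name the tunable parameters, so the sketch assumes a fixed signature length and a mismatch tolerance d, and treats a window as a unique signature when every other window of the same length in the database differs from it in more than d positions (Hamming distance). Under that assumption, every signature found at tolerance d' >= d is also a signature at tolerance d, so a stricter rerun only needs to recheck the earlier result. The names `discover`, `windows`, and `hamming` are hypothetical, and the brute-force comparison stands in for whatever indexing a real implementation would use.

```python
from typing import Iterator, List, Optional, Set, Tuple

# A window is identified by (sequence index, start position, text).
Window = Tuple[int, int, str]


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def windows(db: List[str], length: int) -> Iterator[Window]:
    """Yield every substring of the given length in every database sequence."""
    for seq_id, seq in enumerate(db):
        for pos in range(len(seq) - length + 1):
            yield seq_id, pos, seq[pos:pos + length]


def discover(db: List[str], length: int, d: int,
             candidates: Optional[Set[Window]] = None) -> Set[Window]:
    """Return windows whose distance to every other window exceeds d.

    With candidates=None this is a complete discovery over the whole
    database. Passing a previous result restricts the candidates to it,
    which is valid here only when the new d is >= the previous d
    (an assumption of this sketch, not a statement from the paper).
    """
    all_windows = list(windows(db, length))
    pool = all_windows if candidates is None else candidates
    result: Set[Window] = set()
    for cand in pool:
        text = cand[2]
        # Compare against every other occurrence in the database.
        if all(other == cand or hamming(text, other[2]) > d
               for other in all_windows):
            result.add(cand)
    return result


# Toy database (made up for illustration only).
db = ["ACGTAC", "ACGGTT", "TTTTTT"]
first = discover(db, length=4, d=0)                        # complete discovery
stricter = discover(db, length=4, d=1, candidates=first)   # incremental rerun
```

In this sketch the incremental rerun still compares each candidate against the whole database, but it examines only the windows that survived the previous run, which is where the reduction in work comes from. A parameter change in the other direction (a smaller d) would not be covered by this shortcut and would need a complete discovery.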