An incremental algorithm for efficient unique signature discoveries on DNA databases

  • Authors: Hsiao Ping Lee; Tzu-Fang Sheu; Chuan Yi Tang

  • Affiliations: Chung Shan Medical University, Taichung, Taiwan, R.O.C. and National Tsing Hua University, Hsinchu, Taiwan, R.O.C.; Providence University, Taichung, Taiwan, R.O.C.; National Tsing Hua University, Hsinchu, Taiwan, R.O.C.

  • Venue: Proceedings of the 2010 ACM Symposium on Applied Computing
  • Year: 2010

Abstract

DNA signatures are distinct short nucleotide sequences that can be used to detect the presence of certain organisms and to distinguish those organisms from all other species. The signatures provide valuable information for many applications, such as PCR primer design and microarray experiments. In practice, a discovery algorithm is used to find unique signatures in DNA databases, and the signatures are then applied to microarray experiments. If the discovered result is not satisfactory, the parameter settings of the algorithm are changed to obtain a new result. This process may be repeated until a satisfactory result is obtained, a situation called consecutively multiple discoveries. It occurs frequently, especially when handling unfamiliar DNA databases. The challenge is to accomplish each new discovery efficiently. Existing discovery algorithms do not consider the needs of consecutively multiple discoveries. In this paper, we propose an incremental algorithm designed specifically for consecutively multiple discoveries. The algorithm is based on observations about the properties of the signatures. It finds the new result by using the previously discovered results as candidates rather than performing a complete discovery on the whole database. Because the candidates in the incremental discovery are limited to the previously discovered signatures, the discovery process is accelerated. Compared with typical discovery algorithms that perform a complete discovery on the whole database, our incremental algorithm saves up to 87% of the execution time in our experiments.
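
The idea of reusing earlier results as candidates can be illustrated with a minimal Python sketch. It is not the authors' algorithm; the abstract does not give the signature definition or the parameters, so everything here is an assumption. The sketch assumes a signature is a length-l substring whose Hamming distance to every other length-l substring is greater than a mismatch tolerance d, and it treats the database as a single string. The names hamming, windows, is_signature, full_discovery and incremental_discovery are hypothetical. Under this assumed definition, any signature for a larger tolerance is also a signature for a smaller one, so a rerun with a stricter tolerance only needs to recheck the previously discovered positions.

```python
from typing import List, Set, Tuple


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def windows(seq: str, l: int) -> List[Tuple[int, str]]:
    """All (start position, substring) pairs of length l in seq."""
    return [(i, seq[i:i + l]) for i in range(len(seq) - l + 1)]


def is_signature(pos: int, pattern: str,
                 all_windows: List[Tuple[int, str]], d: int) -> bool:
    """Assumed definition: every other window differs in more than d positions."""
    return all(p == pos or hamming(pattern, w) > d for p, w in all_windows)


def full_discovery(seq: str, l: int, d: int) -> Set[int]:
    """Complete discovery: every window in the sequence is a candidate."""
    ws = windows(seq, l)
    return {pos for pos, w in ws if is_signature(pos, w, ws, d)}


def incremental_discovery(seq: str, l: int, d_new: int,
                          previous: Set[int]) -> Set[int]:
    """Rerun with a stricter tolerance (d_new >= the earlier d).

    Only positions found in the earlier run are candidates; each one is
    still compared against all windows of the sequence.
    """
    ws = windows(seq, l)
    return {pos for pos in previous
            if is_signature(pos, seq[pos:pos + l], ws, d_new)}


# Toy data, chosen only for illustration.
seq = "ACGTACGGTTACGA"
first = full_discovery(seq, l=4, d=0)
second = incremental_discovery(seq, l=4, d_new=1, previous=first)
```

In this sketch the saving comes from the smaller candidate set: the second run checks only the positions in first instead of every window. It covers only the case where the tolerance is tightened for a fixed length; how the paper handles other parameter changes is not stated in the abstract.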