Similarity searching in DNA sequences by spectral distortion measures

  • Authors:
  • Tuan D. Pham

  • Affiliations:
  • Bioinformatics Applications Research Centre

  • Venue:
  • ICDM'06 Proceedings of the 6th Industrial Conference on Data Mining conference on Advances in Data Mining: applications in Medicine, Web Mining, Marketing, Image and Signal Mining
  • Year:
  • 2006

Abstract

Searching for similarity among biological sequences is an important research area of bioinformatics because it can provide insight into the evolutionary and genetic relationships between species, which in turn can open doors to new scientific discoveries such as drug design and treatment. In this paper, we introduce a novel measure of similarity between two biological sequences that does not require alignment. The method is based on the concept of spectral distortion measures developed for signal processing. The proposed method was tested using a set of six DNA sequences taken from Escherichia coli K-12 and Shigella flexneri, and one random sequence. It was further tested with a more complex dataset of 40 DNA sequences taken from the GenBank sequence database. The results obtained with the proposed method were found to be superior to those of some existing methods for measuring the similarity of DNA sequences.
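
To make the idea of an alignment-free, spectrum-based comparison concrete, the following is a minimal sketch and not the procedure used in the paper. The abstract does not say how sequences are converted to signals or which distortion measure is applied, so everything here is an assumption chosen for illustration: the per-base indicator encoding, the periodogram estimate, the root-mean-square log-spectral distance, and the names `indicator_signal`, `power_spectrum`, and `log_spectral_distortion` are all hypothetical.

```python
import cmath
import math


def indicator_signal(seq, base):
    """Binary signal: 1.0 where `base` occurs in `seq`, else 0.0 (assumed encoding)."""
    return [1.0 if ch == base else 0.0 for ch in seq.upper()]


def power_spectrum(signal, n_bins=64):
    """Periodogram of the mean-removed signal at n_bins frequencies in (0, pi].

    Using a fixed frequency grid lets sequences of different lengths
    be compared without aligning them.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    spectrum = []
    for k in range(1, n_bins + 1):
        omega = math.pi * k / n_bins
        acc = sum(x * cmath.exp(-1j * omega * t) for t, x in enumerate(centered))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def log_spectral_distortion(seq_a, seq_b, n_bins=64, eps=1e-9):
    """RMS difference of log power spectra, averaged over the four bases.

    Smaller values mean more similar spectra. `eps` avoids log(0);
    it is an illustrative choice, not a value from the paper.
    """
    total = 0.0
    for base in "ACGT":
        pa = power_spectrum(indicator_signal(seq_a, base), n_bins)
        pb = power_spectrum(indicator_signal(seq_b, base), n_bins)
        squared = [(math.log(a + eps) - math.log(b + eps)) ** 2
                   for a, b in zip(pa, pb)]
        total += math.sqrt(sum(squared) / n_bins)
    return total / 4.0


if __name__ == "__main__":
    # Toy sequences, made up for the example (not data from the paper).
    seq_x = "ACGTTGCA" * 4
    seq_y = "AAAACCCCGGGGTTTT" * 2
    print(log_spectral_distortion(seq_x, seq_x))
    print(log_spectral_distortion(seq_x, seq_y))
```

Under these assumptions, a sequence compared with itself gives a distortion of zero, and the measure is symmetric in its two arguments. The direct frequency-by-frequency sum is quadratic in sequence length; an FFT would be the natural replacement for genome-scale inputs.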