Comparison of Genomic Sequences Clustering Using Normalized Compression Distance and Evolutionary Distance

  • Authors:
  • Massimo Rosa;Riccardo Rizzo;Alfonso Urso;Salvatore Gaglio

  • Affiliations:
  • Dipartimento di Ingegneria Informatica, Università di Palermo, Italy and ICAR-CNR, Consiglio Nazionale delle Ricerche, Palermo, Italy;ICAR-CNR, Consiglio Nazionale delle Ricerche, Palermo, Italy;ICAR-CNR, Consiglio Nazionale delle Ricerche, Palermo, Italy;Dipartimento di Ingegneria Informatica, Università di Palermo, Italy and ICAR-CNR, Consiglio Nazionale delle Ricerche, Palermo, Italy

  • Venue:
  • KES '08 Proceedings of the 12th international conference on Knowledge-Based Intelligent Information and Engineering Systems, Part III
  • Year:
  • 2008

Abstract

Genomic sequences are usually compared using evolutionary distance, a procedure that requires aligning the sequences. Aligning long sequences is time-consuming, and the resulting dissimilarity is not a metric. The normalized compression distance was recently introduced as a method to calculate the distance between two generic digital objects, and it seems a suitable way to compare genomic strings. In this paper, the clustering and the mapping obtained with a self-organizing map (SOM) using the traditional evolutionary distance and the compression distance are compared in order to understand whether the two sets of distances are similar. The first results indicate that the two distances capture different aspects of the genomic sequences, and further investigation is needed to obtain a definitive result.
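
As an illustration of the normalized compression distance mentioned in the abstract, the sketch below uses its standard definition, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(.) is the compressed length of its argument. The abstract does not say which compressor the authors used, so the choice of zlib here is an assumption, as are the helper names `compressed_size` and `ncd`. The sequences are synthetic random strings over the alphabet A, C, G, T, not real genomic data.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the input after zlib compression (level 9).

    zlib is an assumed stand-in; the paper may have used another compressor.
    """
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # compressed size of the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


# Synthetic toy data (hypothetical, for illustration only).
rng = random.Random(0)
seq_a = "".join(rng.choices("ACGT", k=2000))

# seq_b: a copy of seq_a with up to 20 point substitutions
# (a substitution may draw the same base, leaving that position unchanged).
chars = list(seq_a)
for pos in rng.sample(range(len(chars)), 20):
    chars[pos] = rng.choice("ACGT")
seq_b = "".join(chars)

# seq_c: an independent random sequence, unrelated to seq_a.
seq_c = "".join(rng.choices("ACGT", k=2000))

d_ab = ncd(seq_a.encode(), seq_b.encode())  # near-duplicate pair
d_ac = ncd(seq_a.encode(), seq_c.encode())  # unrelated pair
print(f"NCD(a, b) = {d_ab:.3f}")
print(f"NCD(a, c) = {d_ac:.3f}")
```

With a real compressor the distance is only an approximation: NCD(x, x) is small but generally not exactly zero, and values slightly above 1 can occur. A pairwise matrix of such distances could then be passed to a clustering method, in the same way as a matrix of evolutionary distances.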