Several remarks on the metric space of genetic codes

  • Authors:
  • David Weisman; Dan A. Simovici

  • Affiliations:
  • Department of Biology, University of Massachusetts Boston, 100 Morrissey Blvd., Boston, Massachusetts 02125, USA; Department of Computer Science, University of Massachusetts Boston, 100 Morrissey Blvd., Boston, Massachusetts 02125, USA

  • Venue:
  • International Journal of Data Mining and Bioinformatics
  • Year:
  • 2012

Abstract

A genetic code, the mapping from trinucleotide codons to amino acids, can be viewed as a partition on the set of 64 codons. A small set of non-standard genetic codes is known, and these codes can be mathematically compared by their partitions of the codon set. To measure distances between set partitions, this study defines a parameterised family of metric functions that includes Shannon entropy as a special case. Distances were computed for 17 curated genetic codes using four members of the metric function family. With these metric functions, nuclear genetic codes had relatively small inter-code distances, while mitochondrial codes exhibited greater variance. Hierarchical clustering using Ward's algorithm produced a tight grouping of nuclear codes and several distinct clades of mitochondrial codes. This family of functions may be employed in other biological applications involving set partitions, such as analysis of microarray data, gene set enrichment and protein-protein interaction mapping.
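
As a rough illustration of how two partitions of the same codon set could be compared, the sketch below computes an entropy-based distance between partitions given as label lists. The abstract does not state the formula, so everything here is an assumption: the function names, the `beta` parameter, and the Havrda-Charvát style generalised entropy are illustrative choices, with `beta = 1` taken as the Shannon case. The labels are toy data, not real genetic codes.

```python
from collections import Counter
from math import log2


def block_sizes(labels):
    """Sizes of the partition blocks induced by equal labels."""
    return list(Counter(labels).values())


def entropy(sizes, n, beta=1.0):
    """Generalised entropy of a partition with the given block sizes.

    Assumption: Havrda-Charvat style form, normalised so that beta -> 1
    recovers Shannon entropy in bits. The paper's definition may differ.
    """
    probs = [s / n for s in sizes]
    if abs(beta - 1.0) < 1e-12:
        return -sum(p * log2(p) for p in probs)
    return (1.0 - sum(p ** beta for p in probs)) / (1.0 - 2.0 ** (1.0 - beta))


def partition_distance(a, b, beta=1.0):
    """Distance between two partitions of the same set.

    a[i] and b[i] are the block labels of element i. The distance is
    2*H(common refinement) - H(a) - H(b), which for beta = 1 is the
    sum of the two conditional Shannon entropies.
    """
    if len(a) != len(b):
        raise ValueError("partitions must label the same set of elements")
    n = len(a)
    h_a = entropy(block_sizes(a), n, beta)
    h_b = entropy(block_sizes(b), n, beta)
    # Pairing the labels gives the blocks of the common refinement.
    h_meet = entropy(block_sizes(list(zip(a, b))), n, beta)
    return 2.0 * h_meet - h_a - h_b


if __name__ == "__main__":
    # Toy labels for four hypothetical "codons"; not real genetic codes.
    code_a = ["X", "X", "Y", "Y"]
    code_b = ["X", "X", "Y", "Z"]
    for beta in (1.0, 2.0):
        print(beta, partition_distance(code_a, code_b, beta))
```

Under these assumptions, a genetic code would be passed as a list of 64 amino acid labels in a fixed codon order, and the pairwise distances could then be fed to a clustering routine.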