On similarity codes

  • Authors:
  • A. G. D'yachkov; D. C. Torney

  • Affiliations:
  • Dept. of Probability Theory, Moscow State Univ.;-

  • Venue:
  • IEEE Transactions on Information Theory
  • Year:
  • 2006

Abstract

We introduce a biologically motivated measure of sequence similarity for quaternary N-sequences, extending Hamming similarity. This measure is the sum, over all positions of the sequences, of “alphabetic” similarities. Alphabetic similarities are defined, symmetrically, on the Cartesian square of the alphabet. These similarities equal zero whenever the two elements differ. In distinction to Hamming similarity, however, our alphabetic similarities take individual values whenever the two elements are identical. In this correspondence, we derive lower and upper bounds on the rate of the corresponding quaternary nonlinear and linear codes, called similarity codes, as applied to DNA sequences.
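
The similarity measure described in the abstract can be illustrated with a minimal sketch. It assumes that the "individual values" are per-letter weights applied when two aligned letters match. The weights in `WEIGHTS` are hypothetical and chosen only for illustration, since the abstract gives no numerical values. The function names are likewise not taken from the paper.

```python
# Hypothetical per-letter weights for the quaternary (DNA) alphabet.
# The abstract does not state numerical values; these are illustrative only.
WEIGHTS = {"A": 1, "C": 2, "G": 2, "T": 1}


def similarity(x, y, weights=WEIGHTS):
    """Sum of alphabetic similarities over all positions.

    A position contributes weights[a] when both sequences carry the same
    letter a, and contributes 0 when the letters differ.
    """
    if len(x) != len(y):
        raise ValueError("sequences must have the same length N")
    return sum(weights[a] for a, b in zip(x, y) if a == b)


def hamming_similarity(x, y):
    """Special case with every weight equal to 1: the number of matching positions."""
    return similarity(x, y, {letter: 1 for letter in "ACGT"})
```

With these assumed weights, `similarity("ACGT", "ACGA")` adds 1 + 2 + 2 for the three matching positions, whereas `hamming_similarity` counts each match as 1. Two sequences that differ at every position have similarity zero under either measure.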