Comparison of Symbol Sequences: No Editing, No Alignment

  • Authors: M. G. Sadovsky
  • Affiliations: Institute of Biophysics, Siberian Division of Russian Academy of Sciences, Akademgorodok, Krasnoyarsk, 660036
  • Venue: Open Systems & Information Dynamics
  • Year: 2002

Abstract

A new method for comparing two (or several) symbol sequences is developed. The method compares the frequencies of short fragments of the sequences; it requires no string editing or other transformation of the compared objects. The comparison is made by calculating the specific entropy of a frequency dictionary against a special dictionary, called the hybrid one, which is the statistical ancestor of the group of sequences being compared. Some applications of the method to genetics, bioinformatics, and linguistics are discussed.
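
The abstract gives no formulas, so the following Python sketch only illustrates the idea under stated assumptions. It takes a frequency dictionary to be the relative frequencies of all overlapping fragments of a fixed length q, takes the hybrid dictionary to be the arithmetic mean of the individual dictionaries, and takes the specific entropy to be the relative entropy of a dictionary against the hybrid. The function names and the parameter q are hypothetical; the paper's actual definitions may differ.

```python
from collections import Counter
from math import log


def frequency_dictionary(seq, q):
    """Relative frequencies of all overlapping length-q fragments of seq."""
    counts = Counter(seq[i:i + q] for i in range(len(seq) - q + 1))
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def hybrid_dictionary(dicts):
    """Assumption: the hybrid is the arithmetic mean of the dictionaries."""
    words = set().union(*dicts)
    n = len(dicts)
    return {w: sum(d.get(w, 0.0) for d in dicts) / n for w in words}


def specific_entropy(d, hybrid):
    """Assumption: sum of f * ln(f / h) over the words present in d."""
    return sum(f * log(f / hybrid[w]) for w, f in d.items() if f > 0)


if __name__ == "__main__":
    a = frequency_dictionary("ACGTACGTAC", 2)
    b = frequency_dictionary("AAAACCCCGG", 2)
    h = hybrid_dictionary([a, b])
    # Larger values mean a dictionary lies further from the hybrid.
    print(specific_entropy(a, h), specific_entropy(b, h))
```

Under these assumptions the specific entropy is zero when all compared sequences share the same frequency dictionary and grows as they diverge. No editing or alignment step appears anywhere in the computation.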