Sequence alignment with arbitrary steps and further generalizations, with applications to alignments in linguistics

  • Authors:
  • Steffen Eger

  • Affiliations:
  • -

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2013

Abstract

We provide simple generalizations of the classical Needleman-Wunsch algorithm for aligning two sequences. First, we let both sequences be defined over arbitrary, potentially different alphabets. Second, we consider similarity functions between elements of both sequences with ranges in a semiring. Third, instead of considering only 'match', 'mismatch' and 'skip' operations, we allow arbitrary non-negative alignment 'steps' S. Next, we present novel combinatorial formulas for the number of monotone alignments between two sequences for selected steps S. Finally, we illustrate sample applications in natural language processing that require larger steps than those available in the original Needleman-Wunsch sequence alignment procedure, and to which our generalizations can therefore be fruitfully applied.
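
The generalizations named in the abstract can be illustrated with a short dynamic-programming sketch. This is not the paper's code: the function name `align`, its parameters, and the toy similarity functions are assumptions made for illustration. The sketch takes a set of steps S as pairs (a, b) of non-negative integers (excluding (0, 0)), a similarity function on subsequences, and the two operations and two identity elements of a semiring.

```python
import operator

def align(x, y, steps, sim, plus, times, zero, one):
    """Combine, over all monotone alignments of x and y built from `steps`,
    the semiring product of the similarities of the aligned chunks.

    steps: pairs (a, b) of non-negative ints, not (0, 0); a step consumes
           a symbols of x and b symbols of y (hypothetical encoding of S).
    sim:   maps a chunk of x and a chunk of y to a semiring value.
    plus, times, zero, one: the semiring operations and their identities.
    """
    n, m = len(x), len(y)
    table = [[zero] * (m + 1) for _ in range(n + 1)]
    table[0][0] = one  # the empty alignment
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            acc = zero
            for a, b in steps:
                if a <= i and b <= j:
                    # Extend every alignment ending at (i - a, j - b) by one step.
                    chunk = sim(x[i - a:i], y[j - b:j])
                    acc = plus(acc, times(table[i - a][j - b], chunk))
            table[i][j] = acc
    return table[n][m]

# Max-plus semiring: yields the best alignment score.
def best_score(x, y, steps, sim):
    return align(x, y, steps, sim, max, operator.add, float("-inf"), 0)

# Counting semiring with constant similarity 1: yields the number of alignments.
def count_alignments(x, y, steps):
    return align(x, y, steps, lambda u, v: 1, operator.add, operator.mul, 0, 1)

CLASSIC = [(1, 1), (1, 0), (0, 1)]  # match/mismatch and the two skips

def toy_sim(u, v):
    # Toy scores (assumed): 1 for equal symbols or for "x" read as "ks", else -1.
    return 1 if u == v or (u, v) == ("x", "ks") else -1
```

With the classical steps, the counting semiring gives `count_alignments("ab", "cd", CLASSIC) == 13`, the Delannoy number D(2, 2). Adding a larger step such as (1, 2) lets one symbol align with two, so `best_score("ax", "aks", [(1, 1), (1, 2)], toy_sim)` returns 2 by pairing "a" with "a" and "x" with "ks". Swapping the semiring changes what is computed without touching the recursion.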