Relational Sequence Alignments and Logos

  • Authors:
  • Andreas Karwath;Kristian Kersting

  • Affiliations:
  • University of Freiburg, Institute for Computer Science, Machine Learning Lab, Georges-Koehler-Allee, Building 079, 79110 Freiburg, Germany;University of Freiburg, Institute for Computer Science, Machine Learning Lab, Georges-Koehler-Allee, Building 079, 79110 Freiburg, Germany

  • Venue:
  • Inductive Logic Programming
  • Year:
  • 2007

Abstract

The need to measure sequence similarity arises in many application domains and often coincides with sequence alignment: the more similar two sequences are, the better they can be aligned. Aligning sequences not only shows how similar they are, it also shows where the differences and correspondences between them lie. Traditionally, alignment has been considered for sequences of flat symbols only. Many real-world sequences, such as natural language sentences and protein secondary structures, however, exhibit rich internal structure. This is akin to the problem of dealing with structured examples studied in the field of inductive logic programming (ILP). In this paper, we introduce Real, a powerful yet simple approach that aligns sequences of structured symbols by using well-established ILP distance measures within traditional alignment methods. Although the approach is straightforward, experiments on protein data and Medline abstracts show that it works well in practice, that the resulting alignments can indeed provide more information than flat ones, and that they are meaningful to experts when represented graphically.
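
The idea of plugging a distance on structured symbols into a traditional alignment method can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Nienhuys-Cheng-style distance on ground terms as one possible ILP distance, a cost-minimizing global alignment in the style of Needleman-Wunsch, and a fixed gap cost of 0.5. The names `term_distance` and `align`, the tuple encoding of terms, and the example atoms are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple, Union

# Assumed encoding: a constant is a str, a compound term is
# a tuple (functor, arg1, ..., argN).
Term = Union[str, tuple]


def term_distance(s: Term, t: Term) -> float:
    """Nienhuys-Cheng-style distance between ground terms, in [0, 1]."""
    if s == t:
        return 0.0
    # Differing constants, or a constant against a compound term.
    if isinstance(s, str) or isinstance(t, str):
        return 1.0
    # Compound terms with a different functor or arity.
    if s[0] != t[0] or len(s) != len(t):
        return 1.0
    n = len(s) - 1  # arity; n >= 1 here because equal terms returned above
    return sum(term_distance(a, b) for a, b in zip(s[1:], t[1:])) / (2 * n)


def align(xs: List[Term], ys: List[Term], gap: float = 0.5
          ) -> Tuple[float, List[Tuple[Optional[Term], Optional[Term]]]]:
    """Global alignment minimizing substitution (term distance) plus gap costs."""
    n, m = len(xs), len(ys)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = cost[i - 1][0] + gap
    for j in range(1, m + 1):
        cost[0][j] = cost[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + term_distance(xs[i - 1], ys[j - 1]),
                cost[i - 1][j] + gap,   # xs[i-1] aligned to a gap
                cost[i][j - 1] + gap,   # ys[j-1] aligned to a gap
            )
    # Trace back one optimal alignment; None marks a gap.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and cost[i][j] ==
                cost[i - 1][j - 1] + term_distance(xs[i - 1], ys[j - 1])):
            pairs.append((xs[i - 1], ys[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or cost[i][j] == cost[i - 1][j] + gap):
            pairs.append((xs[i - 1], None))
            i -= 1
        else:
            pairs.append((None, ys[j - 1]))
            j -= 1
    pairs.reverse()
    return cost[n][m], pairs


# Hypothetical secondary-structure sequences, for illustration only.
xs = [("helix", "alpha", "long"), ("strand", "plus", "short")]
ys = [("helix", "alpha", "short"), ("coil", "medium"), ("strand", "plus", "short")]
total, pairs = align(xs, ys)
print(total)  # 0.75
for a, b in pairs:
    print(a, "|", b)
```

In this sketch the two helix atoms are aligned at cost 0.25 because they differ only in their length argument, whereas flat symbols would treat them as either identical or entirely different. This is the sense in which a structured distance can make an alignment more informative.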