SCA: phonetic alignment based on sound classes

  • Authors:
  • Johann-Mattis List

  • Affiliations:
  • Heinrich Heine University Düsseldorf, Germany

  • Venue:
  • ESSLLI'10 Proceedings of the 2010 international conference on New Directions in Logic, Language and Computation
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper I present the most recent version of the SCA method for pairwise and multiple alignment analyses. In contrast to previously proposed alignment methods, SCA is based on a novel framework of sequence alignment which combines new approaches to sequence modeling in historical linguistics with recent developments in computational biology. In contrast to earlier versions of SCA [1,2] the new version comes along with a couple of modifications that significantly improve the performance and the application range of the algorithm: A new sound class model was defined which works well on highly divergent sequences, the algorithm for pairwise alignment was modified to be sensitive to secondary sequence structures such as syllable boundaries, and an algorithm for the pre-processing of the data in multiple alignment analyses [3] was included to cope for the bias resulting from progressive alignment analyses. In order to test the method, a new gold standard for pairwise and multiple alignment analyses was created which consists of 45 947 sequences covering a total of 435 different taxa belonging to six different language families.