Inducing sound segment differences using Pair Hidden Markov Models

  • Authors:
  • Martijn Wieling;Therese Leinonen;John Nerbonne

  • Affiliations:
  • University of Groningen;University of Groningen;University of Groningen

  • Venue:
  • SigMorPhon '07 Proceedings of Ninth Meeting of the ACL Special Interest Group in Computational Morphology and Phonology
  • Year:
  • 2007

Abstract

Pair Hidden Markov Models (PairHMMs) are trained to align the pronunciation transcriptions of a large contemporary collection of Dutch dialect material, the Goeman-Taeldeman-Van Reenen-Project (GTRP, collected 1980–1995). We focus on the question of how to incorporate information about sound segment distances to improve sequence distance measures for use in dialect comparison. PairHMMs induce segment distances via expectation maximisation (EM). Our analysis uses a phonologically comparable subset of 562 items for all 424 localities in the Netherlands. We evaluate the work in two ways: first, by comparison with analyses obtained using the Levenshtein distance on the same dataset, and second, by assessing the quality of the induced vowel distances against acoustic differences.
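
To make concrete the idea of plugging segment distances into a sequence distance measure, the following is a minimal sketch of a Levenshtein distance whose substitution cost is supplied by the caller. It is not the authors' PairHMM and does not perform EM training; it only illustrates where induced segment distances could enter a Levenshtein-style baseline. The function names, the fixed insertion/deletion cost, and the toy cost table are all assumptions made for illustration, not values from the paper.

```python
from typing import Callable, Sequence


def weighted_levenshtein(
    a: Sequence[str],
    b: Sequence[str],
    sub_cost: Callable[[str, str], float],
    indel_cost: float = 1.0,
) -> float:
    """Edit distance between two segment sequences.

    sub_cost(x, y) gives the cost of aligning segment x with segment y;
    indel_cost is a fixed cost for inserting or deleting one segment
    (an assumption of this sketch).
    """
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = [j * indel_cost for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [i * indel_cost]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + indel_cost,            # delete x
                curr[j - 1] + indel_cost,        # insert y
                prev[j - 1] + sub_cost(x, y),    # align x with y
            ))
        prev = curr
    return prev[-1]


def unit_cost(x: str, y: str) -> float:
    """Plain Levenshtein substitution cost: 0 for a match, 1 otherwise."""
    return 0.0 if x == y else 1.0


# Hypothetical segment distances (illustrative only, not induced values).
TOY_DISTANCES = {("a", "e"): 0.3}


def toy_cost(x: str, y: str) -> float:
    """Look up a symmetric toy distance, falling back to the unit cost."""
    if x == y:
        return 0.0
    return TOY_DISTANCES.get((x, y), TOY_DISTANCES.get((y, x), 1.0))
```

With `unit_cost` the function reduces to the ordinary Levenshtein distance; passing a function such as `toy_cost` lets acoustically or statistically similar segments count as cheaper substitutions than unrelated ones.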