In this paper we apply the multi-way decomposition method PARAFAC to detect the most prominent sound changes in dialect variation. We investigate various phonetic patterns in both stressed and unstressed syllables. Our starting point is the set of regular sound correspondences, which are automatically extracted from aligned transcriptions and then analyzed with PARAFAC. This allows us to analyze the co-occurrence patterns of all sound correspondences in the data set simultaneously and to determine the most important factors of the variation. We examine the first ten dimensions in more detail by recovering the geographical distribution of the extracted correspondences. We also compare the dialect divisions based on the extracted correspondences with the divisions based on the whole data set and with those of traditional scholarship. The results show that PARAFAC can be used successfully to detect the linguistic basis of automatically obtained dialect divisions.
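
For readers unfamiliar with PARAFAC, the following is a minimal sketch of a three-way decomposition fitted by alternating least squares, written with NumPy. It is not the implementation used in the paper: the function names (`khatri_rao`, `unfold`, `parafac_als`, `reconstruct`), the iteration count, and the random toy tensor are all assumptions made for illustration, and the abstract does not specify how the modes of the actual data tensor are defined.

```python
import numpy as np


def khatri_rao(a, b):
    """Column-wise Kronecker product of a (I x R) and b (J x R), giving (I*J x R)."""
    rank = a.shape[1]
    return (a[:, None, :] * b[None, :, :]).reshape(-1, rank)


def unfold(x, mode):
    """Matricize a 3-way array so that rows index the chosen mode."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)


def parafac_als(x, rank, n_iter=200, seed=0):
    """Fit a rank-`rank` PARAFAC model to a 3-way array by alternating least squares.

    Illustrative sketch only: random initialization, fixed number of sweeps,
    no convergence check and no normalization of the factors.
    """
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((dim, rank)) for dim in x.shape]
    for _ in range(n_iter):
        for mode in range(3):
            # Hold the other two factor matrices fixed and solve for this one.
            first, second = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(first, second)
            gram = (first.T @ first) * (second.T @ second)
            factors[mode] = unfold(x, mode) @ kr @ np.linalg.pinv(gram)
    return factors


def reconstruct(factors):
    """Rebuild the 3-way array as a sum of rank-one terms."""
    a, b, c = factors
    return np.einsum("ir,jr,kr->ijk", a, b, c)


# Toy data (hypothetical, not dialect data): an exactly rank-2 tensor.
rng = np.random.default_rng(42)
true_factors = [rng.standard_normal((dim, 2)) for dim in (6, 5, 4)]
toy = reconstruct(true_factors)

estimated = parafac_als(toy, rank=2)
relative_error = np.linalg.norm(toy - reconstruct(estimated)) / np.linalg.norm(toy)
print(f"relative reconstruction error: {relative_error:.2e}")
```

Each column of a factor matrix corresponds to one extracted dimension. In a setting like the one described above, the loadings of a dimension along one mode would be inspected to see which items contribute most to it, although which mode plays which role depends on how the tensor is built.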