Hierarchical spectral partitioning of bipartite graphs to cluster dialects and identify distinguishing features

  • Authors: Martijn Wieling; John Nerbonne
  • Affiliations: University of Groningen, The Netherlands (both authors)
  • Venue: TextGraphs-5: Proceedings of the 2010 Workshop on Graph-based Methods for Natural Language Processing
  • Year: 2010

Abstract

In this study we apply hierarchical spectral partitioning of bipartite graphs to a Dutch dialect dataset in order to cluster dialect varieties and, at the same time, determine the sound correspondences that characterize each cluster. An important advantage of this clustering method over other dialectometric methods is that the linguistic basis of the clustering is determined simultaneously, which bridges the gap between traditional and quantitative dialectology. We show that the hierarchical clustering improves on the results of the flat spectral clustering method used in an earlier study (Wieling and Nerbonne, 2009). We also show that the values of the second singular vector, which is used to generate the two-way clustering, can be used to identify the most important sound correspondences for each cluster. This is an important advantage of the hierarchical method, as it obviates the need for external methods to determine the most important sound correspondences for a geographical cluster.
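
The general idea of recursively splitting a bipartite graph with the second singular vector can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the degree normalization commonly used for bipartite spectral graph partitioning (scaling the matrix by the inverse square roots of the row and column degrees) and, as a simplification, splits on the sign of the rescaled second singular vectors. The function names `bipartition` and `hierarchical` and the toy matrix are hypothetical: rows stand for dialect varieties, columns for sound correspondences, and a 1 means the correspondence occurs in that variety.

```python
import numpy as np

def bipartition(A):
    """Score the row and column nodes of a bipartite graph for a two-way split."""
    A = np.asarray(A, dtype=float)
    d1 = A.sum(axis=1)                    # degree of each row node (variety)
    d2 = A.sum(axis=0)                    # degree of each column node (correspondence)
    An = A / np.sqrt(np.outer(d1, d2))    # D1^(-1/2) A D2^(-1/2)
    U, _, Vt = np.linalg.svd(An, full_matrices=False)
    row_scores = U[:, 1] / np.sqrt(d1)    # second left singular vector, rescaled
    col_scores = Vt[1, :] / np.sqrt(d2)   # second right singular vector, rescaled
    return row_scores, col_scores

def hierarchical(A, rows, cols, depth):
    """Recursively bipartition; returns a list of (row_ids, col_ids) leaf clusters."""
    sub = A[np.ix_(rows, cols)]
    # Stop when the block is too small or contains an isolated node.
    if (depth == 0 or min(sub.shape) < 2
            or (sub.sum(axis=1) == 0).any() or (sub.sum(axis=0) == 0).any()):
        return [(list(rows), list(cols))]
    r, c = bipartition(sub)
    parts = []
    for side in (True, False):            # assumption: split on the sign of the score
        r_ids = [i for i, s in zip(rows, r) if (s >= 0) == side]
        c_ids = [j for j, s in zip(cols, c) if (s >= 0) == side]
        if not r_ids or not c_ids:        # degenerate split: keep the block whole
            return [(list(rows), list(cols))]
        parts.append((r_ids, c_ids))
    return [leaf for r_ids, c_ids in parts
            for leaf in hierarchical(A, r_ids, c_ids, depth - 1)]

# Hypothetical toy data: 4 varieties x 5 sound correspondences.
A = np.array([[1, 1, 1, 0, 0],
              [1, 1, 0, 0, 0],
              [0, 0, 1, 1, 1],
              [0, 0, 0, 1, 1]])
leaves = hierarchical(A, [0, 1, 2, 3], [0, 1, 2, 3, 4], depth=1)
for variety_ids, correspondence_ids in leaves:
    print("varieties", variety_ids, "correspondences", correspondence_ids)
```

Each leaf pairs a group of varieties with the sound correspondences assigned to the same side of the split, which is what lets one procedure yield both the clustering and its linguistic basis. The sketch returns the raw column scores from `bipartition` without ranking them, since the abstract does not state how the singular vector values are turned into an importance ordering.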