Main dialect identification in Mainland China, Hong Kong and Taiwan

  • Authors:
  • Dunxiao Wei; Jun-Yong Zhu; Wei-Shi Zheng; Jianhuang Lai

  • Affiliations:
  • School of Mathematics and Computational Science, Sun Yat-Sen University, Guangzhou, P.R. China;School of Mathematics and Computational Science, Sun Yat-Sen University, Guangzhou, P.R. China;School of Information Science and Technology, Sun Yat-Sen University, Guangzhou, P.R. China;School of Information Science and Technology, Sun Yat-Sen University, Guangzhou, P.R. China

  • Venue:
  • CCBR'11: Proceedings of the 6th Chinese Conference on Biometric Recognition
  • Year:
  • 2011

Abstract

As an emerging area of speech recognition, dialect identification plays an important role in promoting applications of speech recognition technology. Since communication among Mainland China, Hong Kong and Taiwan is becoming more frequent, it is particularly necessary to identify their dialects. This paper makes three contributions to this problem: 1) we build a speech corpus for the main dialects of the three areas; 2) we use the popular GMM-based method to extensively evaluate identification of the main dialects of Mainland China and Hong Kong, and of Mainland China and Taiwan, and we find that the differences between Mainland China Mandarin and Taiwan Mandarin are much smaller than those between Mandarin and Cantonese, which leads to unsatisfactory results for the Mainland China and Taiwan pair; 3) based on an analysis of the GMM, we propose an improved method, maximum KL distance based Gaussian component selection (MKLD-GCS), to improve dialect identification between Mainland China Mandarin and Taiwan Mandarin. Experimental results show that the proposed method obtains better identification performance than related methods.
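
To make the two technical ingredients named in the abstract more concrete, the following is a minimal sketch, not the paper's implementation. The first part shows the standard GMM decision rule: score an utterance's feature frames under one diagonal-covariance GMM per dialect and pick the most likely model. The second part shows the closed-form KL divergence between two diagonal Gaussians and uses it to rank mixture components. The abstract does not describe how MKLD-GCS selects components, so `select_components` (keep the `k` components of one model that are farthest, in symmetric KL distance, from their nearest component in the other model) is only one plausible reading of the method's name. All function names, the `(weight, mean, variance)` tuple layout, and the toy parameters are assumptions made for illustration.

```python
import math

def diag_gauss_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def gmm_loglik(frames, gmm):
    """Average per-frame log-likelihood under a GMM.

    gmm is a list of (weight, mean, var) tuples (assumed layout).
    """
    total = 0.0
    for x in frames:
        logs = [math.log(w) + diag_gauss_logpdf(x, m, v) for w, m, v in gmm]
        peak = max(logs)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(l - peak) for l in logs))
    return total / len(frames)

def identify(frames, models):
    """Return the name of the dialect model with the highest likelihood."""
    return max(models, key=lambda name: gmm_loglik(frames, models[name]))

def kl_diag_gauss(m0, v0, m1, v1):
    """Closed-form KL(N0 || N1) for diagonal Gaussians."""
    return 0.5 * sum(
        math.log(b / a) + (a + (p - q) ** 2) / b - 1.0
        for p, a, q, b in zip(m0, v0, m1, v1)
    )

def select_components(gmm_a, gmm_b, k):
    """Hypothetical selection rule (an assumption, not taken from the paper):
    keep the k components of gmm_a whose symmetric KL distance to the
    nearest component of gmm_b is largest, then renormalize the weights."""
    def dist(comp):
        _, m, v = comp
        return min(
            kl_diag_gauss(m, v, mb, vb) + kl_diag_gauss(mb, vb, m, v)
            for _, mb, vb in gmm_b
        )
    kept = sorted(gmm_a, key=dist, reverse=True)[:k]
    total = sum(w for w, _, _ in kept)
    return [(w / total, m, v) for w, m, v in kept]

# Toy 1-D models standing in for two dialect GMMs (made-up parameters).
models = {
    "A": [(1.0, [0.0], [1.0])],
    "B": [(1.0, [5.0], [1.0])],
}
label = identify([[0.1], [-0.2]], models)
```

In practice the frames would be acoustic features such as MFCCs, and the GMM parameters would be estimated with EM on a training corpus; both steps are omitted here. The intuition behind a selection step like this is that, when two dialects are acoustically close, most mixture components are shared and contribute little to discrimination, so scoring with only the most divergent components may sharpen the decision.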