Speech Synthesis for Error Training Models in CALL

  • Authors:
  • Xin Zhang; Qin Lu; Jiping Wan; Guangguang Ma; Tin Shing Chiu; Weiping Ye; Wenli Zhou; Qiao Li

  • Affiliations:
  • Department of Electronics, Beijing Normal University, China; Department of Computing, Hong Kong Polytechnic University, China; Department of Electronics, Beijing Normal University, China; Department of Electronics, Beijing Normal University, China; Department of Computing, Hong Kong Polytechnic University, China; Department of Electronics, Beijing Normal University, China; Department of Electronics, Beijing Normal University, China; Department of Electronics, Beijing Normal University, China

  • Venue:
  • ICCPOL '09 Proceedings of the 22nd International Conference on Computer Processing of Oriental Languages. Language Technology for the Knowledge-based Economy
  • Year:
  • 2009

Abstract

A computer-assisted pronunciation teaching (CAPT) system is a fundamental component of a computer-assisted language learning (CALL) system. A CAPT system based on speech recognition often requires a large amount of speech data to train the incorrect-phone models in its speech recognizer, but collecting incorrectly pronounced speech data is labor-intensive and costly. This paper reports an effort to train the incorrect-phone models using synthesized speech data. A special formant speech synthesizer is designed to filter correctly pronounced phones into incorrect phones by modifying their formant frequencies. Within a Putonghua (Mandarin Chinese) CALL system for native Cantonese speakers, a small experimental CAPT system is built whose recognizer is trained on synthetic speech data. Evaluation shows that a CAPT system using synthesized data can perform as well as, or even better than, one using real data, provided that the amount of synthetic data is large enough.
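
The abstract does not give the synthesizer's design, so the following is only a rough sketch of the general idea of producing a phone variant by moving formant frequencies. It assumes a textbook cascade of second-order digital resonators driven by an impulse train, which may differ from the authors' filter. The function names, the formant and bandwidth values, and the scaling factors are all hypothetical and chosen for illustration.

```python
import math

def resonator_coeffs(freq_hz, bandwidth_hz, sample_rate_hz):
    """Coefficients (a, b, c) of y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    t = 1.0 / sample_rate_hz
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = 2.0 * math.exp(-math.pi * bandwidth_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c  # normalizes the gain to 1 at 0 Hz
    return a, b, c

def resonate(signal, freq_hz, bandwidth_hz, sample_rate_hz):
    """Pass a list of samples through one formant resonator."""
    a, b, c = resonator_coeffs(freq_hz, bandwidth_hz, sample_rate_hz)
    y1 = y2 = 0.0  # previous two output samples
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def shift_formants(formants, scales):
    """Scale each formant frequency; bandwidths are left unchanged."""
    return [(freq * s, bw) for (freq, bw), s in zip(formants, scales)]

def synthesize(formants, num_samples=800, sample_rate_hz=16000, pitch_hz=120):
    """Drive a cascade of resonators with an impulse train (crude voiced source)."""
    period = int(sample_rate_hz / pitch_hz)
    signal = [1.0 if i % period == 0 else 0.0 for i in range(num_samples)]
    for freq, bw in formants:
        signal = resonate(signal, freq, bw, sample_rate_hz)
    return signal

# Hypothetical (frequency, bandwidth) pairs in Hz, not taken from the paper.
correct_phone = [(700.0, 80.0), (1200.0, 90.0), (2600.0, 120.0)]
# Hypothetical "mispronounced" variant: lower F1, raise F2, keep F3.
incorrect_phone = shift_formants(correct_phone, [0.8, 1.3, 1.0])

correct_audio = synthesize(correct_phone)
incorrect_audio = synthesize(incorrect_phone)
```

In a setup like the one the abstract describes, waveforms such as `incorrect_audio` would stand in for recorded mispronunciations when training the recognizer's incorrect-phone models. How the authors choose the formant shifts is not stated here.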