A method for Vietnamese text normalization to improve the quality of speech synthesis

  • Authors:
  • Thu-Trang Thi Nguyen; Thanh Thi Pham; Do-Dat Tran

  • Affiliations:
  • Hanoi University of Technology, Hanoi, Vietnam; Hanoi University of Technology, Hanoi, Vietnam; Hanoi University of Technology, Hanoi, Vietnam

  • Venue:
  • Proceedings of the 2010 Symposium on Information and Communication Technology
  • Year:
  • 2010

Abstract

Text normalization is a necessary component of a Text-To-Speech (TTS) system and is in general a challenging problem, especially for Vietnamese because of the local context. Research on Vietnamese text normalization for TTS systems is still at an early stage, relying on very simple sets of ad hoc rules for individual cases despite the ambiguity of real text. The purpose of this paper is to take some initial steps towards methodically normalizing Vietnamese input text for a TTS system. The paper proposes a categorization and a normalization model for Vietnamese text based on related results for other languages. An experimental application is implemented to demonstrate the model; it uses several techniques, including a letter language model and decision trees for classifying non-standard words (NSWs), and both supervised and unsupervised approaches for expanding abbreviations.
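
The abstract names a letter language model as one of the techniques but does not describe it. As a rough illustration of the general idea only, and not the authors' implementation, the Python sketch below trains a letter-bigram model with add-one smoothing on a tiny word list and scores tokens by their average log-probability per letter transition. The word list, the function names `train_letter_bigrams` and `avg_log_prob`, and the choice of bigrams and smoothing are all assumptions made for this example.

```python
import math
import unicodedata
from collections import Counter


def _pad(token):
    # Normalize to NFC so each accented Vietnamese letter is one code point,
    # lowercase it, and add start (^) and end ($) boundary markers.
    return "^" + unicodedata.normalize("NFC", token).lower() + "$"


def train_letter_bigrams(words):
    """Count letter bigrams over a list of known words (hypothetical helper)."""
    bigrams = Counter()   # counts of (letter, next_letter)
    contexts = Counter()  # counts of letter as the first element of a bigram
    alphabet = set()
    for word in words:
        padded = _pad(word)
        alphabet.update(padded)
        for a, b in zip(padded, padded[1:]):
            bigrams[(a, b)] += 1
            contexts[a] += 1
    return bigrams, contexts, len(alphabet)


def avg_log_prob(token, model):
    """Average log-probability per letter transition, with add-one smoothing."""
    bigrams, contexts, alphabet_size = model
    padded = _pad(token)
    total = 0.0
    for a, b in zip(padded, padded[1:]):
        # Add-one smoothing over the symbols seen in training (an assumption;
        # a real system would likely use a larger corpus and better smoothing).
        p = (bigrams[(a, b)] + 1) / (contexts[a] + alphabet_size)
        total += math.log(p)
    return total / (len(padded) - 1)


# Tiny illustrative word list; an assumption, not data from the paper.
known_words = ["hà", "nội", "việt", "nam", "tin", "học",
               "công", "thông", "tiếng", "nói"]
model = train_letter_bigrams(known_words)

for token in ["tin", "TTS", "kg"]:
    print(token, round(avg_log_prob(token, model), 3))
```

In this sketch, a token whose letter sequences resemble the training words scores higher than an abbreviation such as "TTS" or a unit such as "kg". A score of this kind could serve as one feature, for example as input to a decision tree, when deciding whether a token should be read as an ordinary word or handled as an NSW.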