A decision-theoretic generalization of on-line learning and an application to boosting
Journal of Computer and System Sciences - Special issue: 26th annual ACM symposium on the theory of computing & STOC'94, May 23–25, 1994, and second annual Europe an conference on computational learning theory (EuroCOLT'95), March 13–15, 1995
Automatic ToBI prediction and alignment to speed manual labeling of prosody
Speech Communication - Special issue on speech annotation and corpus tools
Context-Dependent Boundary Model for Refining Boundaries Segmentation of TTS Units
IEICE - Transactions on Information and Systems
Selecting non-uniform units from a very large corpus for concatenative speech synthesizer
ICASSP '01 Proceedings of the Acoustics, Speech, and Signal Processing, 200. on IEEE International Conference - Volume 02
Enrich web applications with voice internet persona text-to-speech for anyone, anywhere
HCI'07 Proceedings of the 12th international conference on Human-computer interaction: intelligent multimodal interaction environments
Hi-index | 0.00 |
Voice database is one of the most important parts in TTS systems. However, creating a high quality new TTS voice is not an easy task even for a professional team. The whole process is rather complicated and contains plenty minutiae that should be handled carefully. In fact, in many stages, human interference such as manually checking or labeling is necessary. In multi-lingual situations, it is more challenge to find qualified people to do this kind of interference. That’s why most state-of-the-art TTS systems can provide only a few voices. In this paper, we outline a uniform paradigm for creating multi-lingual TTS voice databases. It focuses on technologies that can either improve the scalability of data collection or reduce human interference such as manually checking or labeling. With this paradigm, we decrease the complexity and work load of the task.