We present the implementation and evaluation of an open-domain unit selection speech synthesis engine designed to be flexible enough to encourage further unit selection research and to allow rapid voice development by users with minimal speech synthesis knowledge and experience. We address the issues of automatically processing speech data into a usable voice using automatic segmentation techniques, and show how the knowledge obtained at labelling time can be exploited at synthesis time. We describe the target cost and join cost implementations for such a system and report the outcome of building voices from datasets of varying sizes. We show that, in a competitive evaluation, voices built with this technology compare favourably to other systems.
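The core of a unit selection engine of the kind described is a lattice search that balances a target cost (how well a candidate unit matches the desired specification) against a join cost (how smoothly adjacent units concatenate). The sketch below is a minimal, hypothetical illustration of that search using Viterbi dynamic programming; the feature names (`f0_start`, `f0_end`) and cost functions are illustrative assumptions, not the system's actual implementation.

```python
# Minimal sketch of unit selection with target and join costs.
# Units and target specs are dicts of (assumed) acoustic features.

def target_cost(target_spec, unit):
    # How far a candidate unit is from the desired specification.
    return sum(abs(target_spec[k] - unit[k]) for k in target_spec)

def join_cost(prev_unit, unit):
    # Mismatch at the concatenation point (here: F0 continuity only).
    return abs(prev_unit["f0_end"] - unit["f0_start"])

def select_units(target_specs, candidates):
    """Viterbi search over the candidate lattice, minimising total cost.

    candidates[i] is the list of database units available for target
    position i; returns one unit per position.
    """
    n = len(target_specs)
    # best[i][j] = (cumulative cost, backpointer) for candidate j at position i
    best = [[(target_cost(target_specs[0], u), None) for u in candidates[0]]]
    for i in range(1, n):
        row = []
        for u in candidates[i]:
            tc = target_cost(target_specs[i], u)
            cost, back = min(
                (best[i - 1][j][0] + join_cost(p, u) + tc, j)
                for j, p in enumerate(candidates[i - 1])
            )
            row.append((cost, back))
        best.append(row)
    # Backtrace from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = [j]
    for i in range(n - 1, 0, -1):
        j = best[i][j][1]
        path.append(j)
    path.reverse()
    return [candidates[i][path[i]] for i in range(n)]
```

In a real engine the costs would combine many weighted sub-costs (spectral, prosodic, linguistic context) and the lattice would be pruned for efficiency, but the search structure is the same.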