On the limited memory BFGS method for large scale optimization
Mathematical Programming: Series A and B
Stochastic Complexity in Statistical Inquiry Theory
Stochastic Complexity in Statistical Inquiry Theory
Voice Characteristics Conversion for HMM-based Speech Synthesis System
ICASSP '97 Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '97)-Volume 3 - Volume 3
IEICE - Transactions on Information and Systems
Details of the Nitech HMM-Based Speech Synthesis System for the Blizzard Challenge 2005
IEICE - Transactions on Information and Systems
Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR
ICASSP '01 Proceedings of the Acoustics, Speech, and Signal Processing, 200. on IEEE International Conference - Volume 02
A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis
IEICE - Transactions on Information and Systems
A Hidden Semi-Markov Model-Based Speech Synthesis System
IEICE - Transactions on Information and Systems
Robust speaker-adaptive HMM-based text-to-speech synthesis
IEEE Transactions on Audio, Speech, and Language Processing
Hi-index | 0.00 |
We describe a statistical parametric speech synthesis system developed by a joint group from the Nagoya Institute of Technology (Nitech) and the Nara Institute of Science and Technology (NAIST) for the annual open evaluation of text-to-speech synthesis systems named Blizzard Challenge 2006. To improve our 2005 system (Nitech-HTS 2005), we investigated new features such as mel-generalized cepstrum-based line spectral pairs (MGC-LSPs), maximum likelihood linear transform (MLLT), and a full covariance global variance (GV) probability density function (pdf). A combination of mel-cepstral coefficients, MLLT, and full covariance GV pdf scored highest in subjective listening tests, and the 2006 system performed significantly better than the 2005 system. The Blizzard Challenge 2006 evaluations show that Nitech-NAIST-HTS 2006 is competitive even when working with relatively large speech databases.