Review: Statistical parametric speech synthesis

Authors:
Heiga Zen;Keiichi Tokuda;Alan W. Black
Affiliations:
Department of Computer Science and Engineering, Nagoya Institute of Technology, Gokiso-cho, Showa-ku, Nagoya 466-8555, Japan and Cambridge Research Laboratory, Toshiba Research Europe Ltd., 208 Ca ...;Department of Computer Science and Engineering, Nagoya Institute of Technology, Gokiso-cho, Showa-ku, Nagoya 466-8555, Japan;Language Technologies Institute, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA
Venue:
Speech Communication
Year:
2009

Abstract

This review gives a general overview of techniques used in statistical parametric speech synthesis. One instance of these techniques, called hidden Markov model (HMM)-based speech synthesis, has recently been demonstrated to be very effective in synthesizing acceptable speech. This review also contrasts these techniques with the more conventional technique of unit-selection synthesis that has dominated speech synthesis over the last decade. The advantages and drawbacks of statistical parametric synthesis are highlighted and we identify where we expect key developments to appear in the immediate future.