Perceptual and objective detection of discontinuities in concatenative speech synthesis
ICASSP '01 Proceedings of the Acoustics, Speech, and Signal Processing, 200. on IEEE International Conference - Volume 02
Binaural cue coding (BCC) was introduced as an efficient representation method for MPEG-4 spatial audio coding (SAC). In a low bit-rate environment, however, the spectrum of the BCC output signals is perceptually degraded. The system proposed in this paper estimates virtual source location information (VSLI) as the side information. VSLI represents the spatial image between channels of the playback layout as an angle. Subjective assessment results show that the proposed method provides better audio quality than the BCC method when encoding multi-channel signals.
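
To make the idea of an angle-valued spatial cue concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a two-channel sub-band and a constant-power (sine/cosine) panning law, neither of which is stated in the abstract. The function names `band_power`, `estimate_pan_angle`, and `gains_from_angle` are hypothetical and chosen only for illustration.

```python
import math


def band_power(samples):
    """Mean power of one sub-band of one channel (samples must be non-empty)."""
    return sum(x * x for x in samples) / len(samples)


def estimate_pan_angle(left, right):
    """Return an illustrative source angle in degrees.

    Assumption: constant-power panning, so 0 means fully left,
    45 means centered, and 90 means fully right.
    """
    p_left = band_power(left)
    p_right = band_power(right)
    # atan2 of the two amplitude estimates yields the panning angle.
    return math.degrees(math.atan2(math.sqrt(p_right), math.sqrt(p_left)))


def gains_from_angle(angle_deg):
    """Recover the (left, right) channel gains implied by the angle."""
    theta = math.radians(angle_deg)
    return math.cos(theta), math.sin(theta)


if __name__ == "__main__":
    # A toy source panned with gains 0.8 (left) and 0.6 (right).
    source = [math.sin(0.1 * n) for n in range(256)]
    left = [0.8 * s for s in source]
    right = [0.6 * s for s in source]

    angle = estimate_pan_angle(left, right)
    g_left, g_right = gains_from_angle(angle)
    print(f"angle = {angle:.2f} deg, gains = ({g_left:.2f}, {g_right:.2f})")
```

Under these assumptions a single angle per sub-band stands in for the level relationship between the two channels, and the decoder side can regenerate channel gains from it. How the paper actually derives and quantizes VSLI for multi-channel layouts is not specified in the abstract.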