Boundary Refining Aiming at Speech Synthesis Applications

Authors:
Monique V. Nicodem;Sandra G. Kafka;Rui Seara, Jr.;Rui Seara
Affiliations:
LINSE (Circuits and Signal Processing Laboratory), Department of Electrical Engineering, Federal University of Santa Catarina, Brazil (all authors)
Venue:
PROPOR '08 Proceedings of the 8th international conference on Computational Processing of the Portuguese Language
Year:
2008

Abstract

In concatenative synthesis, speech is produced by joining segments selected automatically from the units of a previously segmented database. The quality of the resulting synthetic speech often improves when the database is segmented with accurate tools. The performance of these tools is often enhanced by a hybrid approach that combines HMM modeling with a boundary refining process. Such refining has been carried out successfully with techniques based on neural networks. This paper presents a set of networks that outperform other topologies discussed in the literature. These networks are trained after clustering the training set so that similar phonetic transitions are grouped together.
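
The clustering step mentioned at the end of the abstract can be illustrated with a minimal sketch. The abstract does not state the clustering criterion, so the broad phonetic classes, the `BROAD_CLASS` table, the function names, and the sample layout below are all assumptions made for illustration. The idea shown is only the partitioning: each boundary sample is assigned to a cluster by its left and right phone classes, so that one refining network could later be trained per cluster.

```python
from collections import defaultdict

# Hypothetical broad phonetic classes (an assumption for illustration;
# the grouping actually used in the paper is not given in the abstract).
BROAD_CLASS = {
    "a": "vowel", "e": "vowel", "i": "vowel", "o": "vowel", "u": "vowel",
    "p": "plosive", "t": "plosive", "k": "plosive",
    "b": "plosive", "d": "plosive", "g": "plosive",
    "s": "fricative", "z": "fricative", "f": "fricative", "v": "fricative",
    "m": "nasal", "n": "nasal",
    "sil": "silence",
}


def transition_cluster(left_phone, right_phone):
    """Map a phone-to-phone transition to a (left class, right class) key."""
    return (
        BROAD_CLASS.get(left_phone, "other"),
        BROAD_CLASS.get(right_phone, "other"),
    )


def cluster_training_set(samples):
    """Group boundary samples by transition cluster.

    Each sample is (left_phone, right_phone, features, boundary_offset),
    a hypothetical layout: features describe the signal around the HMM
    boundary, boundary_offset is the correction a network would learn.
    """
    clusters = defaultdict(list)
    for left, right, features, offset in samples:
        clusters[transition_cluster(left, right)].append((features, offset))
    return dict(clusters)


# Toy data, invented for the example only.
samples = [
    ("a", "p", [0.1, 0.2], 0.004),
    ("e", "t", [0.3, 0.1], -0.002),
    ("s", "a", [0.5, 0.4], 0.001),
    ("sil", "m", [0.0, 0.2], 0.010),
]
clusters = cluster_training_set(samples)
```

Here the two vowel-to-plosive samples fall into the same cluster, so a single network would see both during training, while the fricative-to-vowel and silence-to-nasal transitions each get their own training subset.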