C-TOBI-Based pitch accent prediction using maximum-entropy model
ICCSA'06 Proceedings of the 2006 international conference on Computational Science and Its Applications - Volume Part III
Prosody modeling is critical in developing text-to-speech (TTS) systems, where speech synthesis is used to generate natural-sounding speech automatically. In this paper, we present a prosody generation architecture based on the Chinese Tone and Break Index (C-ToBI) representation. ToBI is a multi-tier representation system, grounded in linguistic knowledge, for transcribing the events in an utterance. TTS systems that adopt ToBI as an intermediate representation are known to exhibit greater flexibility, modularity, and domain/task portability than TTS systems that generate prosody directly. We model Chinese prosody generation as a classification problem and apply conditional Maximum Entropy (ME) classification to it. We empirically evaluate the usefulness of various natural language and phonological features in order to build a well-integrated feature set for the ME framework.
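To illustrate the classification formulation, the following is a minimal sketch of a conditional ME classifier, in which the probability of a label given a context is proportional to the exponential of the summed weights of the active (feature, label) pairs. It is not the system described in this paper: the feature names (`pos=...`, `tone=...`), the two labels, the toy training data, and the plain gradient-ascent training loop are all assumptions chosen for illustration only.

```python
import math
from collections import defaultdict

# Hypothetical label set; real C-ToBI pitch accent labels are richer.
LABELS = ["accent", "none"]

def predict_proba(weights, feats, labels=LABELS):
    """Conditional ME distribution p(y | x) over labels for one context."""
    scores = {y: sum(weights.get((f, y), 0.0) for f in feats) for y in labels}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())      # normalisation term Z(x)
    return {y: e / z for y, e in exps.items()}

def train_maxent(samples, labels=LABELS, epochs=200, lr=0.5):
    """Fit weights by gradient ascent on the conditional log-likelihood.

    The gradient for each (feature, label) pair is the observed count
    minus the count expected under the current model.
    """
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for feats, gold in samples:
            probs = predict_proba(weights, feats, labels)
            for f in feats:
                for y in labels:
                    observed = 1.0 if y == gold else 0.0
                    grad[(f, y)] += observed - probs[y]
        for key, g in grad.items():
            weights[key] += lr * g / len(samples)
    return weights

def classify(weights, feats, labels=LABELS):
    """Return the most probable label for a feature list."""
    probs = predict_proba(weights, feats, labels)
    return max(labels, key=lambda y: probs[y])

# Toy, made-up training data: each syllable is a list of active features.
TRAIN = [
    (["pos=noun", "tone=4"], "accent"),
    (["pos=noun", "tone=1"], "accent"),
    (["pos=verb", "tone=4"], "accent"),
    (["pos=particle", "tone=0"], "none"),
    (["pos=particle", "tone=4"], "none"),
    (["pos=particle", "tone=1"], "none"),
]

model = train_maxent(TRAIN)
print(classify(model, ["pos=noun", "tone=4"]))
print(predict_proba(model, ["pos=particle", "tone=0"]))
```

In this formulation, adding a new linguistic or phonological cue only requires emitting another feature string for each context, which is one reason the ME framework suits experiments that compare and combine heterogeneous features.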