Segmental duration modelling in Turkish

Authors:
Özlem Öztürk;Tolga Çiloğlu
Affiliations:
Electrical and Electronics Engineering Department, Dokuz Eylul University, Izmir, Turkey;Electrical and Electronics Engineering Department, Middle East Technical University, Ankara, Turkey
Venue:
TSD'06 Proceedings of the 9th international conference on Text, Speech and Dialogue
Year:
2006

Abstract

The naturalness of synthetic speech depends heavily on appropriate modelling of prosody. Three prosodic components are usually modelled: segmental duration, pitch contour, and intensity. In this study, we present our work on modelling segmental duration in Turkish using machine-learning algorithms, in particular Classification and Regression Trees. The models predict phone durations from attributes extracted from a speech corpus, such as the identities of the current, preceding, and following phones, stress, part-of-speech, word length in number of syllables, and the position of the word in the utterance. The resulting models predict segment durations better than mean-duration approximations (correlation coefficient ~0.77, root-mean-squared error 20.4 ms). To improve prediction performance further, the attributes used to build the segmental duration models are optimized with the Sequential Forward Selection method. The selection yields the following optimum attribute set for phoneme duration modelling: phone identity, neighboring phone identities, lexical stress, syllable type, part-of-speech, phrase break information, and location of the word in the phrase.
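
The attribute-selection loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: the predictor is a simple lookup of mean durations per attribute combination (a stand-in for a regression tree, not CART itself), the attribute names `phone`, `stress`, and `slot` are hypothetical, and the durations are synthetic numbers, not values from the paper's corpus. The sketch only shows the greedy mechanics of Sequential Forward Selection and the two reported metrics (correlation coefficient and RMSE), measured against a mean-duration baseline.

```python
import math
from collections import defaultdict

def rmse(pred, true):
    """Root-mean-squared error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))

def correlation(pred, true):
    """Pearson correlation coefficient; 0.0 if either sequence is constant."""
    n = len(true)
    mp, mt = sum(pred) / n, sum(true) / n
    cov = sum((p - mp) * (t - mt) for p, t in zip(pred, true))
    var_p = sum((p - mp) ** 2 for p in pred)
    var_t = sum((t - mt) ** 2 for t in true)
    return cov / math.sqrt(var_p * var_t) if var_p > 0 and var_t > 0 else 0.0

def fit_predict(train, test, attrs):
    """Stand-in predictor (NOT CART): mean training duration of rows that
    share the selected attribute values, falling back to the global mean.
    With attrs == [] this is exactly the mean-duration baseline."""
    global_mean = sum(r["dur"] for r in train) / len(train)
    groups = defaultdict(list)
    for r in train:
        groups[tuple(r[a] for a in attrs)].append(r["dur"])
    means = {key: sum(v) / len(v) for key, v in groups.items()}
    return [means.get(tuple(r[a] for a in attrs), global_mean) for r in test]

def sequential_forward_selection(train, dev, candidates):
    """Greedily add the attribute that lowers held-out RMSE the most;
    stop when no remaining attribute gives a strict improvement."""
    true = [r["dur"] for r in dev]
    selected = []
    best = rmse(fit_predict(train, dev, selected), true)  # baseline score
    while True:
        trials = [(rmse(fit_predict(train, dev, selected + [a]), true), a)
                  for a in candidates if a not in selected]
        if not trials:
            break
        score, attr = min(trials)
        if score >= best:
            break
        selected.append(attr)
        best = score
    return selected, best

# Synthetic toy data: invented base durations in ms, purely illustrative.
BASE_MS = {"a": 90, "k": 60, "s": 110}

def make_rows(jitter):
    """Full factorial over phone x stress x slot; 'slot' is irrelevant to
    duration by construction, and 'jitter' perturbs durations by +/- ms."""
    rows = []
    for phone, base in BASE_MS.items():
        for stress in (0, 1):
            for slot in range(4):
                offset = jitter if slot % 2 == 0 else -jitter
                rows.append({"phone": phone, "stress": stress, "slot": slot,
                             "dur": base + 20 * stress + offset})
    return rows

train, dev = make_rows(0), make_rows(3)
dev_true = [r["dur"] for r in dev]

baseline_rmse = rmse(fit_predict(train, dev, []), dev_true)
selected, best_rmse = sequential_forward_selection(
    train, dev, ["phone", "stress", "slot"])
final_corr = correlation(fit_predict(train, dev, selected), dev_true)

print("baseline RMSE (ms):", round(baseline_rmse, 2))
print("selected attributes:", selected)
print("selected-model RMSE (ms):", round(best_rmse, 2))
print("selected-model correlation:", round(final_corr, 3))
```

On this toy data the loop stops after picking `phone` and `stress`, because `slot` carries no information about duration and therefore cannot lower the held-out error. With a real corpus, the stand-in predictor would be replaced by a regression tree learner, and the stopping rule and scoring metric are design choices the abstract does not specify.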