Mining intonation corpora using knowledge driven sequential clustering

Authors:
David Escudero-Mancebo;Valentín Cardeñoso-Payo
Affiliations:
Department of Computer Science, University of Valladolid, Valladolid, Spain;Department of Computer Science, University of Valladolid, Valladolid, Spain
Venue:
IBERAMIA-SBIA'06: Proceedings of the 2nd International Joint Conference, 10th Ibero-American Conference on AI, 18th Brazilian Conference on Advances in Artificial Intelligence
Year:
2006

Abstract

This work presents a mining methodology designed to cope with the data scarcity problems typical of intonation corpora, which arise from the high variability of prosodic information. The methodology adapts a basic agglomerative clustering technique, guided by a set of domain constraints. The peculiarities of the text-to-speech intonation modelling problem are taken into account to fix the initial configuration of the clustering and the criteria for merging classes and for stopping their splitting. Data scarcity makes it necessary to apply a sequential selection mechanism over prosodic features in order to obtain the initial set of classes. A search strategy for selecting the best class among a set of alternatives is proposed, which yields prediction models useful for accurate synthetic intonation. Visualizing the final classes by means of a modified decision tree provides graphical cues about contrastable prosodic information in the intonation corpus.
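
As an illustration only, the general idea of agglomerative clustering guided by domain constraints can be sketched as follows. This is not the authors' algorithm: the function names, the `may_merge` constraint, the distance threshold and the toy F0 contours are all assumptions made for the example. Initial classes are keyed by combinations of prosodic feature values, and two classes are merged only when the constraint allows it and their mean contours are close enough.

```python
from itertools import combinations
from statistics import mean


def centroid(contours):
    """Point-wise mean of a list of equal-length contours."""
    return [mean(values) for values in zip(*contours)]


def distance(a, b):
    """Euclidean distance between two contours."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def constrained_agglomerative(classes, may_merge, max_distance):
    """Repeatedly merge the closest admissible pair of classes.

    classes: dict mapping a label (tuple of feature values) to its contours.
    may_merge: hypothetical domain constraint on two sets of labels.
    max_distance: hypothetical stopping threshold on centroid distance.
    """
    # Each class starts as a singleton set of labels.
    classes = {frozenset([label]): list(items) for label, items in classes.items()}
    while len(classes) > 1:
        candidates = [
            (distance(centroid(classes[a]), centroid(classes[b])), a, b)
            for a, b in combinations(classes, 2)
            if may_merge(a, b)
        ]
        if not candidates:
            break  # no pair satisfies the domain constraint
        d, a, b = min(candidates, key=lambda c: c[0])
        if d > max_distance:
            break  # closest admissible pair is still too far apart
        classes[a | b] = classes.pop(a) + classes.pop(b)
    return classes


# Toy data (invented): labels are (position, stress), contours are two F0 points.
toy = {
    ("initial", "stressed"): [[1.0, 2.0], [1.2, 2.2]],
    ("medial", "stressed"): [[1.1, 2.1]],
    ("final", "unstressed"): [[1.0, 2.0]],
    ("final", "stressed"): [[5.0, 1.0]],
}


def same_stress(a, b):
    """Assumed constraint: only classes sharing the stress value may merge."""
    return len({label[1] for label in a | b}) == 1


result = constrained_agglomerative(toy, same_stress, max_distance=0.5)
for labels, contours in result.items():
    print(sorted(labels), len(contours))
```

In this toy run the two nearby stressed classes are merged, while the unstressed class stays separate even though its contour is close, because the assumed constraint forbids that merge.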