This paper presents an evaluation of automatic break insertion for standard Basque. Basque is an agglutinative and inflected language, and POS features, which are widely used for other languages, are not enough to predict accurately where breaks should be inserted in the text. Other morpho-syntactic features, such as grammatical case and information about syntagms, have therefore also been taken into account. Classification and regression trees (CARTs) were used to predict break locations on a textual corpus gathered specially for this study, from which the sentence-internal punctuation marks had been removed. After parameter selection was applied to the whole morpho-syntactic feature set, the best features were used to build two CARTs: T1, which gives the same importance to deletion and insertion errors, and T2, which tries to minimize insertion errors. The objective evaluation of the break insertion algorithms gives a κ statistic of 0.518 and an F-measure of 0.757 for the T1 tree. The algorithms have also been evaluated subjectively: although T1 had better objective measures, it made more serious errors than T2.
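The objective measures mentioned above can be illustrated with a minimal sketch. The abstract does not give the exact definitions used, so this example assumes that each word juncture is labelled 1 (break) or 0 (no break), that κ is Cohen's kappa between the reference and predicted labels, and that F is the balanced harmonic mean of precision and recall on the break class. The function name `break_scores` and the toy label sequences are hypothetical and are not taken from the paper.

```python
from typing import Dict, Sequence


def break_scores(reference: Sequence[int], predicted: Sequence[int]) -> Dict[str, float]:
    """Score predicted breaks against reference breaks, one label per juncture.

    Labels are assumed to be 1 (break) or 0 (no break). This is an
    illustrative sketch, not the evaluation code used in the paper.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    if not reference:
        raise ValueError("at least one juncture is required")

    n = len(reference)
    pairs = list(zip(reference, predicted))
    hits = sum(1 for r, p in pairs if r == 1 and p == 1)
    insertions = sum(1 for r, p in pairs if r == 0 and p == 1)  # spurious breaks
    deletions = sum(1 for r, p in pairs if r == 1 and p == 0)   # missed breaks
    correct_non_breaks = n - hits - insertions - deletions

    precision = hits / (hits + insertions) if hits + insertions else 0.0
    recall = hits / (hits + deletions) if hits + deletions else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall) if precision + recall else 0.0
    )

    # Cohen's kappa: observed agreement corrected for chance agreement,
    # where chance agreement comes from the marginal label frequencies.
    observed = (hits + correct_non_breaks) / n
    expected = (
        (hits + deletions) * (hits + insertions)
        + (correct_non_breaks + insertions) * (correct_non_breaks + deletions)
    ) / (n * n)
    kappa = (observed - expected) / (1 - expected) if expected != 1 else 0.0

    return {
        "insertions": float(insertions),
        "deletions": float(deletions),
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Toy data: ten junctures, three reference breaks.
    reference = [0, 1, 0, 0, 1, 0, 0, 1, 0, 0]
    predicted = [0, 1, 0, 1, 0, 0, 0, 1, 0, 0]
    for name, value in break_scores(reference, predicted).items():
        print(f"{name}: {value:.3f}")
```

Counting insertions and deletions separately also makes the contrast between the two trees concrete: a tree tuned like T2 would be expected to trade some recall for fewer insertions, which lowers the count of spurious breaks even if the overall F-measure drops.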