We propose a two-stage phone duration modelling scheme that can be used to improve prosody modelling in speech synthesis systems. The first stage employs a number of independent feature constructors (FCs), and the second stage employs a phone duration model (PDM) that operates on an extended feature vector. The input to the first stage is a feature vector of numerical and non-numerical linguistic features extracted from text. The extended feature vector is obtained by appending the phone duration predictions of the FCs to this initial feature vector. Experiments on the American-English KED TIMIT and the Modern Greek WCL-1 databases confirmed the advantage of the proposed scheme: it improved prediction accuracy over the best individual predictor and over a two-stage scheme that merely fuses the first-stage outputs. Specifically, compared with the best individual predictor, the mean absolute error and the root mean square error were reduced, in relative terms, by 3.9% and 3.9% on KED TIMIT and by 4.8% and 4.6% on WCL-1, respectively.
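The feature vector extension described above can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration: the two toy feature constructors (a per-category mean and a one-feature least-squares line), the nearest-neighbour second stage, the function names, and the made-up durations in milliseconds are not taken from the paper, which does not specify its learners here. The sketch only shows the data flow: first-stage predictions are appended to the initial mixed feature vector, and the second-stage model is trained on the result. The error measures and the relative reduction are the standard definitions.

```python
from math import sqrt

def extend_features(x, constructors):
    """Append every first-stage duration prediction to the initial vector."""
    return list(x) + [fc(x) for fc in constructors]

def fit_mean_by_category(X, y, idx):
    """Hypothetical FC: mean duration per value of a non-numerical feature."""
    groups = {}
    for x, d in zip(X, y):
        groups.setdefault(x[idx], []).append(d)
    overall = sum(y) / len(y)  # fallback for unseen categories
    means = {k: sum(v) / len(v) for k, v in groups.items()}
    return lambda x: means.get(x[idx], overall)

def fit_linear_1d(X, y, idx):
    """Hypothetical FC: least-squares line on a single numerical feature."""
    xs = [x[idx] for x in X]
    mx, my = sum(xs) / len(xs), sum(y) / len(y)
    var = sum((a - mx) ** 2 for a in xs)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, y))
    slope = cov / var if var else 0.0
    return lambda x: my + slope * (x[idx] - mx)

def distance(a, b):
    """Mixed distance: 0/1 mismatch for strings, absolute difference otherwise."""
    return sum((0.0 if u == v else 1.0) if isinstance(u, str) else abs(u - v)
               for u, v in zip(a, b))

def fit_knn(Z, y, k=1):
    """Hypothetical second-stage PDM: k-nearest-neighbour regression."""
    def predict(z):
        nearest = sorted(range(len(Z)), key=lambda i: distance(Z[i], z))[:k]
        return sum(y[i] for i in nearest) / k
    return predict

def mae(y_true, y_pred):
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    return sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def relative_reduction(baseline, proposed):
    """Relative error reduction in percent with respect to the baseline."""
    return 100.0 * (baseline - proposed) / baseline

# Made-up toy data: [phone class, position in syllable] -> duration in ms.
X = [["vowel", 1.0], ["vowel", 2.0], ["stop", 1.0], ["stop", 2.0]]
y = [90.0, 110.0, 50.0, 60.0]

# Stage 1: train independent feature constructors on the initial features.
fcs = [fit_mean_by_category(X, y, 0), fit_linear_1d(X, y, 1)]

# Stage 2: train the duration model on the extended feature vectors.
Z = [extend_features(x, fcs) for x in X]
pdm = fit_knn(Z, y, k=1)

prediction = pdm(extend_features(["vowel", 1.0], fcs))
```

In this sketch both stages are fitted on the same toy samples for brevity. A real experiment would normally produce the first-stage predictions on data the FCs were not trained on, so that the second stage does not learn from overly optimistic inputs.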