A tutorial on support vector regression
Statistics and Computing
Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems)
Prosody dependent speech recognition on radio news corpus of American English
IEEE Transactions on Audio, Speech, and Language Processing
Hi-index | 0.00 |
For voice rehabilitation, speech intelligibility is an important criterion. Automatic evaluation of intelligibility has been shown to be successful for automatic speech recognition methods combined with prosodic analysis. In this paper, this method is extended by using measures based on the Cepstral Peak Prominence (CPP). 73 hoarse patients (48.3±16.8 years) uttered the vowel /e/ and read the German version of the text "The North Wind and the Sun". Their intelligibility was evaluated perceptually by 5 speech therapists and physicians according to a 5-point scale. Support Vector Regression (SVR) revealed a feature set with a human-machine correlation of r = 0.85 consisting of the word accuracy, smoothed CPP computed from a speech section, and three prosodic features (normalized energy of word-pause-word intervals, F0 value at voice offset in a word, and standard deviation of jitter). The average human-human correlation was r = 0.82. Hence, the automatic method can be a meaningful objective support for perceptual analysis.