Non-verbal vocalisations such as laughter, breathing, hesitation, and consent play an important role in the recognition and understanding of human conversational speech and spontaneous affect. In this contribution we discuss two strategies for the robust discrimination of such events: dynamic modelling with a broad selection of diverse acoustic Low-Level Descriptors, and static modelling, in which these descriptors are projected via statistical functionals onto a 0.6k-dimensional feature space and subsequently de-correlated. As classifiers we employ Hidden Markov Models, Conditional Random Fields, and Support Vector Machines. For extensive parameter optimisation test runs with respect to features and model topology, 2.9k non-verbal vocalisations are extracted from the spontaneous Audio-Visual Interest Corpus. For the discrimination of the named classes we report an accuracy of 80.7% with a garbage model and 92.6% without one.
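
The static modelling strategy can be illustrated with a minimal sketch. It assumes that each vocalisation is available as a matrix of frame-wise Low-Level Descriptor values, that four simple functionals (mean, standard deviation, minimum, maximum) stand in for the much larger functional set, and that de-correlation is done by principal component analysis. None of these choices is stated in the abstract; the function names and the random data are hypothetical.

```python
import numpy as np

def functionals(lld):
    """Map a (frames, n_lld) contour matrix to one fixed-length vector.

    The four statistics below are an illustrative assumption, not the
    functional set used in the paper.
    """
    stats = [lld.mean(axis=0), lld.std(axis=0), lld.min(axis=0), lld.max(axis=0)]
    return np.concatenate(stats)

def decorrelate(features):
    """Rotate features onto their principal axes so columns are uncorrelated.

    PCA is assumed here as one possible de-correlation step.
    """
    centered = features - features.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # symmetric eigendecomposition
    order = np.argsort(eigvals)[::-1]        # largest variance first
    return centered @ eigvecs[:, order]

# Hypothetical data: 20 segments of varying length, 4 descriptors per frame.
rng = np.random.default_rng(0)
segments = [rng.normal(size=(rng.integers(30, 80), 4)) for _ in range(20)]

# Segments of different duration all map to vectors of the same size.
static = np.stack([functionals(s) for s in segments])
projected = decorrelate(static)
```

Because every segment is reduced to a vector of the same length regardless of its duration, a static classifier such as a Support Vector Machine can be trained on `projected` directly, whereas the dynamic strategy would pass the frame-wise matrices to a sequence model.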