Automatic Classification of Expressiveness in Speech: A Multi-corpus Study

  • Authors:
  • Mohammad Shami;Werner Verhelst

  • Affiliations:
  • Vrije Universiteit Brussel, Interdisciplinary Institute for Broadband Technology - IBBT, department ETRO-DSSP, Pleinlaan 2, B-1050 Brussels, Belgium.;Vrije Universiteit Brussel, Interdisciplinary Institute for Broadband Technology - IBBT, department ETRO-DSSP, Pleinlaan 2, B-1050 Brussels, Belgium.

  • Venue:
  • Speaker Classification II
  • Year:
  • 2007

Abstract

We present a study on the automatic classification of expressiveness in speech using four databases that fall into two distinct groups: the first group of two databases contains adult speech directed to infants, while the second group contains adult speech directed to adults. We performed experiments with two feature extraction approaches, the approach developed for Sony's robotic dog AIBO (AIBO) and a segment-based approach (SBA), and with three machine learning algorithms for training the classifiers. In mono-corpus experiments, the classifiers were trained and tested on each database individually. The results show that AIBO and SBA are competitive on the four databases considered, although the AIBO approach works better with long utterances, whereas the SBA seems better suited to the classification of short utterances. When training was performed on one database and testing on another database of the same group, little generalization across the databases occurred, because emotions with the same label occupy different regions of the feature space in the different databases. Fortunately, when the databases are merged, classification results are comparable to those of the within-database experiments. This indicates that the existing approaches for the classification of emotions in speech are efficient enough to handle larger amounts of training data without any reduction in classification accuracy, which should lead to classifiers that are more robust to varying styles of expressiveness in speech.
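
The three evaluation regimes described in the abstract (within-database, cross-database, and merged-database) can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration: the two synthetic "corpora", the two-dimensional features, the labels `label_a` and `label_b`, and the 1-nearest-neighbour classifier are hypothetical and are not the AIBO or SBA features, the databases, or the learning algorithms used in the study. The toy data is deliberately constructed so that the same label occupies different regions of the feature space in the two corpora.

```python
import math
import random


def make_corpus(centers, n_per_class, rng, spread=0.25):
    """Return (features, label) pairs scattered uniformly around each class center."""
    data = []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            point = (cx + rng.uniform(-spread, spread),
                     cy + rng.uniform(-spread, spread))
            data.append((point, label))
    return data


def predict_1nn(train, point):
    """Label of the training example closest to `point` (Euclidean distance)."""
    nearest = min(train, key=lambda item: math.dist(item[0], point))
    return nearest[1]


def accuracy(train, test):
    """Fraction of test examples whose 1-NN prediction matches the true label."""
    correct = sum(predict_1nn(train, x) == y for x, y in test)
    return correct / len(test)


def split(corpus):
    """Deterministic split: even-indexed items for training, odd-indexed for testing."""
    return corpus[0::2], corpus[1::2]


rng = random.Random(0)

# Hypothetical corpora: the same labels sit in different regions of feature space.
corpus_a = make_corpus({"label_a": (0.0, 0.0), "label_b": (4.0, 0.0)}, 20, rng)
corpus_b = make_corpus({"label_a": (4.0, 4.0), "label_b": (0.0, 4.0)}, 20, rng)

train_a, test_a = split(corpus_a)
train_b, test_b = split(corpus_b)

# Within-database: train and test on the same corpus.
within_a = accuracy(train_a, test_a)
within_b = accuracy(train_b, test_b)

# Cross-database: train on corpus A, test on corpus B.
cross_ab = accuracy(corpus_a, corpus_b)

# Merged: pool the training halves, test on the pooled test halves.
merged = accuracy(train_a + train_b, test_a + test_b)

print("within A:", within_a)
print("within B:", within_b)
print("cross A->B:", cross_ab)
print("merged:", merged)
```

On this constructed data, the within-database and merged runs classify well, while the cross-database run does poorly, because the classifier trained on one corpus has never seen where the other corpus places each label. A classifier that can represent several regions per label, such as nearest-neighbour here, is what lets the merged training set work; this choice is part of the sketch, not a claim about the study's classifiers.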