Improved speech recognition via speaker stress directed classification

  • Authors:
  • B. D. Womack; J. H. L. Hansen

  • Affiliations:
  • Robust Speech Process. Lab., Duke Univ., Durham, NC, USA

  • Venue:
  • ICASSP '96: Proceedings of the 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 01
  • Year:
  • 1996

Abstract

Speech production variations due to perceptually induced stress contribute significantly to reduced speech processing performance. This study proposes an algorithm for estimating the degree of perceptually induced stress. It is suggested that the resulting stress score could be integrated into speech processing algorithms to improve robustness in adverse conditions. First, results from a previous study motivate the selection of a targeted set of speech features across phoneme and stress groups to improve stress classification performance. Articulatory, excitation, and cepstral-based features are analyzed using a previously established stressed speech database (SUSAS). Targeted feature sets are selected across ten stress conditions (including Apache helicopter, angry, clear, Lombard effect, and loud). Next, an improved targeted-feature stress classification system is developed and evaluated, achieving classification rates of 91.01%. Finally, stress classification is incorporated into a stress-directed speech recognition system. Improvements of +10.14% and +15.43% over conventionally trained neutral and multi-style trained recognizers, respectively, are demonstrated using the new stress-directed recognition system.
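
As an illustration of the stress-directed idea summarized in the abstract, the following Python sketch first assigns a stress label to an utterance and then routes the utterance to a recognizer associated with that label. It is a minimal sketch under stated assumptions, not the authors' system: the abstract does not specify the classifier or the recognizers, so a nearest-centroid rule stands in for the stress classifier, and the names `classify_stress`, `stress_directed_recognize`, `CENTROIDS`, and `RECOGNIZERS`, along with the two-dimensional toy features, are all hypothetical.

```python
from typing import Callable, Dict, Sequence, Tuple

Features = Sequence[float]


def classify_stress(features: Features, centroids: Dict[str, Features]) -> str:
    """Return the stress label whose centroid is closest to the features.

    Assumption: nearest-centroid (squared Euclidean distance) is only a
    stand-in for whatever stress classifier the paper actually uses.
    """
    def sq_dist(a: Features, b: Features) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda label: sq_dist(features, centroids[label]))


def stress_directed_recognize(
    features: Features,
    centroids: Dict[str, Features],
    recognizers: Dict[str, Callable[[Features], str]],
    fallback: str = "neutral",
) -> Tuple[str, str]:
    """Classify the stress condition, then dispatch to the matching recognizer.

    If no recognizer exists for the estimated label, fall back to the
    recognizer registered under `fallback`.
    """
    label = classify_stress(features, centroids)
    recognizer = recognizers.get(label, recognizers[fallback])
    return label, recognizer(features)


# Hypothetical toy data: two stress conditions with made-up 2-D centroids.
CENTROIDS: Dict[str, Features] = {"neutral": [0.0, 0.0], "loud": [1.0, 1.0]}

# Hypothetical recognizers: each just reports which model handled the input.
RECOGNIZERS: Dict[str, Callable[[Features], str]] = {
    "neutral": lambda f: "neutral-model",
    "loud": lambda f: "loud-model",
}

if __name__ == "__main__":
    print(stress_directed_recognize([0.9, 1.1], CENTROIDS, RECOGNIZERS))
```

In this toy setup, an utterance whose features lie near the "loud" centroid is handled by the recognizer registered for "loud". The dispatch structure, rather than the particular classifier, is the point of the sketch.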