Comparison of performance with voiced and whispered speech in word recognition and mean-formant-frequency discrimination

  • Authors:
  • Toshio Irino; Yoshie Aoki; Hideki Kawahara; Roy D. Patterson

  • Affiliations:
  • Faculty of Systems Engineering, Wakayama University, 930 Sakaedani, Wakayama 640-8510, Japan;Faculty of Systems Engineering, Wakayama University, 930 Sakaedani, Wakayama 640-8510, Japan;Faculty of Systems Engineering, Wakayama University, 930 Sakaedani, Wakayama 640-8510, Japan;Centre for the Neural Basis of Hearing, Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3EG, United Kingdom

  • Venue:
  • Speech Communication
  • Year:
  • 2012

Abstract

A series of recent studies has examined the interaction of glottal pulse rate (GPR) and mean-formant-frequency (MFF) in the perception of speaker characteristics and in speech recognition. This paper extends that research by comparing the recognition and discrimination performance achieved with voiced words to that achieved with whispered words. The recognition experiment shows that performance with whispered words is slightly worse than with voiced words at all MFFs when the GPR of the voiced words is in the middle of the normal range. However, as GPR decreases below this range, voiced-word performance decreases and eventually becomes worse than whispered-word performance. The discrimination experiment shows that the just noticeable difference (JND) for MFF is essentially independent of the mode of vocal excitation; the JND is close to 5% for both voiced and whispered words for all speaker types. The interaction between GPR and vocal tract length (VTL) is interpreted in terms of the stability of the internal representation of speech, which improves with GPR across the range of values used in these experiments.