Temporal information in gated stop consonants

Authors:
Michael Kiefte
Affiliations:
Dalhousie University, School of Human Communication Disorders, 5599 Fenwick Street, Halifax, Nova Scotia, Canada B3H 1R2
Venue:
Speech Communication
Year:
2003

Abstract

The goal of the present paper is to assess the importance of dynamic spectral information in short, gated stop bursts. Automatic classification of naturally produced stimuli shows that a dynamic spectral representation gives lower misclassification rates than a static one for place-of-articulation distinctions in stimuli longer than 10 ms. At shorter durations, no significant difference is found. Human listeners were then asked to categorize 10- and 20-ms naturally produced, gated bursts in each of two conditions: unprocessed and temporally distorted. At 20 ms, correct identification was significantly lower for the distorted stimuli, while at 10 ms, no significant difference was found. The largest changes in listeners' categorization occur for voiced stops with voice-onset times (VOTs) shorter than the duration of the stimuli; it is hypothesized that the temporal distortion of the onset of voicing contributes largely to the changes in categorization. The perception of voiceless gated stop bursts, by contrast, remains unaffected by the temporal distortion. These results are also supported by statistical models that compare static and dynamic representations of the stimuli. Dynamic properties of stop bursts are thus important only when they include VOT information; that is, dynamic spectral properties within isolated bursts appear to contain no phonetic information up to 20 ms.