Voiced speech as response of a self-consistent fundamental drive

  • Authors: Friedhelm R. Drepper
  • Affiliations: Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
  • Venue: Speech Communication
  • Year: 2007

Abstract

Voiced segments of speech are assumed to be composed of non-stationary acoustic objects that can be described as the stationary response of a non-stationary fundamental drive (FD) process and that are, furthermore, suited to reconstructing the hidden FD by means of a voice-adapted (self-consistent) part-tone decomposition of the speech signal. The universality and robustness of human pitch perception encourage the reconstruction of a band-limited FD in the frequency range of the pitch. The self-consistent decomposition of voiced continuants generates several part-tones that can be confirmed, piecewise, to be topologically equivalent to corresponding acoustic modes of the excitation on the transmitter side. As a topologically equivalent image of a glottal master oscillator, the self-consistent FD is suited to serve as the low-frequency part of the basic time-scale separation of auditory perception and to describe the broadband voiced excitation as an entrained (synchronized) and/or modulated primary response. Guided by the acoustic correlates of pitch and loudness perception, the time-scale separation avoids the conventional assumption of stationary excitation and represents the basic decoding step of an advanced precision transmission protocol of self-consistent (voiced) acoustic objects. The present study focuses on adapting the trajectories (contours) of the centre filter frequencies of the part-tones to the chirp of the glottal master oscillator.
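
The idea of reconstructing a band-limited FD in the pitch range can be illustrated with a minimal sketch. It is not the self-consistent, chirp-adapted part-tone decomposition of the paper: a fixed FFT band mask stands in for the adapted filters, and the test signal is synthetic. The function names (`bandpass_analytic`, `synth_voiced`), the 60 to 180 Hz band, the sampling rate, and the harmonic amplitudes are all assumptions chosen for illustration.

```python
import numpy as np

def bandpass_analytic(x, fs, f_lo, f_hi):
    """Complex analytic signal of x restricted to [f_lo, f_hi] Hz.

    Assumption: a brick-wall FFT mask, used here only as a simple
    stand-in for a voice-adapted band-pass filter.
    """
    spectrum = np.fft.fft(x)
    freqs = np.fft.fftfreq(len(x), d=1.0 / fs)
    mask = (freqs >= f_lo) & (freqs <= f_hi)  # keeps positive frequencies only
    return np.fft.ifft(2.0 * spectrum * mask)

def synth_voiced(fs=8000, dur=0.5, f0_start=100.0, f0_end=140.0, n_harm=5):
    """Hypothetical voiced segment: harmonics phase-locked to a chirped fundamental."""
    t = np.arange(int(fs * dur)) / fs
    f0 = f0_start + (f0_end - f0_start) * t / dur   # linear pitch glide (chirp)
    phase = 2.0 * np.pi * np.cumsum(f0) / fs        # phase of the master oscillator
    x = sum(np.sin(k * phase) / k for k in range(1, n_harm + 1))
    return t, f0, x

fs = 8000
t, f0_true, x = synth_voiced(fs=fs)

# Band-limited drive in the pitch range; the second harmonic (200-280 Hz) is excluded.
fd = bandpass_analytic(x, fs, 60.0, 180.0)

# Instantaneous frequency of the reconstructed drive from its unwrapped phase.
fd_phase = np.unwrap(np.angle(fd))
f0_est = np.gradient(fd_phase) * fs / (2.0 * np.pi)
```

Away from the segment edges, where the brick-wall mask causes ringing, `f0_est` should follow the pitch glide `f0_true`. Because a fixed band cannot track a pitch that moves far, a sketch like this only works for a limited chirp range; tracking the chirp with the filter itself is the problem the abstract addresses.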