Non-stationary self-consistent acoustic objects as atoms of voiced speech

Authors:
Friedhelm R. Drepper
Affiliations:
Zentralinstitut für Elektronik, Forschungszentrum Jülich, Jülich, Germany
Venue:
NOLISP'07 Proceedings of the 2007 international conference on Advances in nonlinear speech processing
Year:
2007

Abstract

To account for the strong non-stationarity of voiced speech and its nonlinear aero-acoustic origin, the classical source-filter model is extended to a cascaded drive-response model. The model comprises a conventional linear secondary response, a synchronized and/or synchronously modulated primary response, and a non-stationary fundamental drive, which plays the role of the long time-scale part of the basic time-scale separation of acoustic perception. The transmission protocol of voiced speech is assumed to be based on non-stationary acoustic objects, which can be synthesized as the described secondary response. These objects are analysed by means of a self-consistent (filter-stable) part-tone decomposition, which is suited to reconstructing the hidden fundamental drive and to confirming its topological equivalence to a glottal master oscillator. The filter-stable part-tone decomposition opens up the option of a phase-modulation transmission protocol for voiced speech. Where acoustic features of voiced speech are required to be invariant to the communication channel, the phase-modulation cues are expected to be particularly well suited to extend and/or replace the classical feature vectors of phoneme and speaker recognition.
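
The idea of part-tones that stay locked to a non-stationary fundamental drive can be illustrated with a toy example. The sketch below is not the filter-stable decomposition of the paper: it assumes the drive phase is known rather than reconstructed, and every name and parameter in it (the sampling rate, the linear pitch glide from 100 to 130 Hz, the three part-tone amplitudes and phase offsets, the helper functions) is a hypothetical choice made only for illustration. It synthesizes a signal whose part-tone phases are integer multiples of a gliding drive phase, then demodulates each part-tone once against that drive phase and once against a stationary reference.

```python
import cmath
import math

# All constants below are illustrative assumptions, not values from the paper.
FS = 8000                   # sampling rate in Hz
DURATION = 0.5              # signal length in seconds
F_START = 100.0             # initial fundamental frequency in Hz
F_SLOPE = 60.0              # linear pitch glide in Hz per second
AMPS = [1.0, 0.5, 0.25]     # part-tone amplitudes
PHASES = [0.0, 0.5, 1.0]    # part-tone phase offsets in rad


def drive_phase(t):
    """Phase of the toy non-stationary fundamental drive at time t."""
    return 2.0 * math.pi * (F_START * t + 0.5 * F_SLOPE * t * t)


def synthesize(times):
    """Sum of part-tones whose phases are integer multiples of the drive phase."""
    return [
        sum(a * math.cos(k * drive_phase(t) + th)
            for k, (a, th) in enumerate(zip(AMPS, PHASES), start=1))
        for t in times
    ]


def demodulate(signal, times, k, phase_fn):
    """Average the signal against exp(-i*k*phase) to estimate part-tone k."""
    acc = sum(s * cmath.exp(-1j * k * phase_fn(t))
              for s, t in zip(signal, times))
    return 2.0 * acc / len(signal)


times = [n / FS for n in range(int(FS * DURATION))]
signal = synthesize(times)

# Demodulation against the true, non-stationary drive phase.
locked = [demodulate(signal, times, k, drive_phase) for k in (1, 2, 3)]

# Demodulation against a stationary 115 Hz reference (the mean pitch).
stationary = demodulate(signal, times, 1,
                        lambda t: 2.0 * math.pi * 115.0 * t)

for k, c in enumerate(locked, start=1):
    print(f"part-tone {k}: amplitude {abs(c):.3f}, "
          f"phase {cmath.phase(c):.3f} rad")
print(f"stationary reference, part-tone 1: amplitude {abs(stationary):.3f}")
```

Under these assumptions, demodulating against the gliding drive phase returns each part-tone's amplitude and phase offset almost unchanged, whereas the stationary reference smears the first part-tone to a much smaller amplitude. This is only meant to suggest why phase cues measured relative to a reconstructed drive might be more stable than cues measured against a fixed frequency grid.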