The locations of formants in a speech signal are usually estimated by computing the linear predictive coefficients (LPC) over a sliding window and finding the peaks in the spectrum of the resulting LP filter. The peak locations are estimated either by root-solving or by computing a coarse spectrum and finding its maxima. We discuss four sources of systematic error in this analysis: (1) quantization of the speech signal due to the fundamental frequency, (2) incorrect order for the LP filter, (3) exclusive reliance upon root-solving, and (4) the three-point parabolic interpolation used to compensate for the coarse spectrum. We show that the expected error due to F0 quantization is ∼10% of F0, and that the other three sources can independently skew the final formant estimates by 10-80 Hz. We also show that errors due to incorrect filter order are related to systematic differences between speakers and phonetic classes, and that root-solving is especially error-prone for low formants or when formants are close to each other. We discuss methods for avoiding these errors and improving the accuracy of formant estimation, and give a heuristic for estimating the optimal filter order of a steady-state signal.
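The analysis pipeline described above (LP coefficients, a coarse spectrum of the LP filter, and three-point parabolic interpolation of its maxima) can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes autocorrelation-method LPC solved with the Levinson-Durbin recursion, a dB-scale spectrum, and a synthetic single-resonance test signal. All function names (`lpc_autocorrelation`, `lp_spectrum_db`, `interpolated_peaks`, `demo`) and the parameter values are hypothetical.

```python
import cmath
import math


def lpc_autocorrelation(x, order):
    """Autocorrelation-method LPC via the Levinson-Durbin recursion.

    Returns [1, a1, ..., ap], the coefficients of the inverse filter
    A(z) = 1 + a1 z^-1 + ... + ap z^-p.
    """
    n = len(x)
    r = [sum(x[i] * x[i + lag] for i in range(n - lag))
         for lag in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        k_m = -acc / err                      # reflection coefficient
        prev = a[:]
        for k in range(1, m):
            a[k] = prev[k] + k_m * prev[m - k]
        a[m] = k_m
        err *= 1.0 - k_m * k_m                # updated prediction error
    return a


def lp_spectrum_db(a, n_bins):
    """LP filter magnitude 1/|A(e^{jw})| in dB at n_bins points on [0, pi)."""
    spec = []
    for b in range(n_bins):
        w = math.pi * b / n_bins
        big_a = sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(a))
        spec.append(-20.0 * math.log10(abs(big_a)))
    return spec


def interpolated_peaks(spec_db, fs):
    """Local maxima of a coarse spectrum, refined by fitting a parabola
    through each maximum and its two neighbours. Returns frequencies in Hz."""
    bin_hz = (fs / 2.0) / len(spec_db)
    peaks = []
    for k in range(1, len(spec_db) - 1):
        lo, mid, hi = spec_db[k - 1], spec_db[k], spec_db[k + 1]
        if mid > lo and mid >= hi:
            # Vertex of the parabola through (-1, lo), (0, mid), (1, hi).
            offset = 0.5 * (lo - hi) / (lo - 2.0 * mid + hi)
            peaks.append((k + offset) * bin_hz)
    return peaks


def demo():
    # Assumed test case: one resonance at 500 Hz, 50 Hz bandwidth, fs = 10 kHz.
    fs, f_true, bw = 10000.0, 500.0, 50.0
    r = math.exp(-math.pi * bw / fs)
    theta = 2.0 * math.pi * f_true / fs
    a1, a2 = -2.0 * r * math.cos(theta), r * r
    # Impulse response of the all-pole filter 1 / (1 + a1 z^-1 + a2 z^-2).
    y = []
    for n in range(2000):
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        y.append((1.0 if n == 0 else 0.0) - a1 * y1 - a2 * y2)
    a = lpc_autocorrelation(y, 2)
    return a, interpolated_peaks(lp_spectrum_db(a, 128), fs)
```

Calling `demo()` returns the recovered LP coefficients and the interpolated peak frequencies. With 128 bins over 0 to 5000 Hz the bin spacing is about 39 Hz, so the interpolation step determines how closely the estimate approaches the true resonance. Root-solving would instead read the formant frequency from the angle of each complex root of A(z).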