An investigation of PLP and IMELDA acoustic representations and of their potential for combination

Authors:
M. J. Hunt;S. M. Richardson;D. C. Bateman;A. Piau
Affiliations:
Marconi Speech & Inf. Syst., Portsmouth, UK;Marconi Speech & Inf. Syst., Portsmouth, UK;Marconi Speech & Inf. Syst., Portsmouth, UK;Marconi Speech & Inf. Syst., Portsmouth, UK
Venue:
ICASSP '91: Proceedings of the 1991 International Conference on Acoustics, Speech, and Signal Processing (ICASSP-91)
Year:
1991

Abstract

Two acoustic representations, the integrated Mel-scale representation with linear discriminant analysis (IMELDA) and perceptual linear prediction with root power sums (PLP-RPS), both of which have given good results in speech recognition tests, are explored. IMELDA is examined in the context of some related representations. Results of speaker-dependent and speaker-independent tests with digits and the alphabet suggest that the optimum PLP order is high, and that the effectiveness of PLP-RPS stems not from its modeling of perceptual properties but from its approximation to a desirable statistical property that IMELDA attains exactly. A combined PLP-IMELDA representation is found to be generally more effective than PLP-RPS, but an IMELDA representation derived directly from a filter bank provides similar results to PLP-IMELDA at a lower computational cost.
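
As a rough illustration of what an LDA-derived representation computed directly from filter-bank outputs might look like, the following is a minimal sketch of a generic linear discriminant transform. It is not the authors' IMELDA procedure: the function name `lda_transform`, the synthetic "filter-bank" data, and all dimensions are assumptions made for the example. The abstract does not say which statistical property it has in mind; the sketch only demonstrates one property that a standard LDA transform is known to have, namely that the within-class covariance of the projected features becomes the identity matrix.

```python
import numpy as np

def lda_transform(features, labels, n_components):
    """Generic LDA projection (hypothetical helper, not the paper's code).

    features: (n_frames, dim) array, e.g. log filter-bank energies.
    labels:   (n_frames,) array of class indices for each frame.
    Returns a (dim, n_components) projection matrix.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    dim = features.shape[1]
    grand_mean = features.mean(axis=0)
    within = np.zeros((dim, dim))
    between = np.zeros((dim, dim))
    for c in np.unique(labels):
        x = features[labels == c]
        mu = x.mean(axis=0)
        centered = x - mu
        within += centered.T @ centered
        d = (mu - grand_mean)[:, None]
        between += len(x) * (d @ d.T)
    within /= len(features)
    between /= len(features)

    # Step 1: whiten the within-class covariance.
    w_vals, w_vecs = np.linalg.eigh(within)
    whiten = w_vecs / np.sqrt(w_vals)  # scales column j by 1/sqrt(w_vals[j])

    # Step 2: diagonalize the between-class covariance in the whitened space
    # and keep the directions with the largest between-class variance.
    b_vals, b_vecs = np.linalg.eigh(whiten.T @ between @ whiten)
    order = np.argsort(b_vals)[::-1][:n_components]
    return whiten @ b_vecs[:, order]

# Synthetic stand-in for filter-bank frames (assumed sizes, not from the paper).
rng = np.random.default_rng(0)
n_classes, frames_per_class, n_channels = 5, 200, 8
class_means = rng.normal(size=(n_classes, n_channels))
mixing = rng.normal(size=(n_channels, n_channels))  # makes channels correlated
labels = np.repeat(np.arange(n_classes), frames_per_class)
noise = rng.normal(size=(len(labels), n_channels)) @ mixing
features = class_means[labels] + noise

W = lda_transform(features, labels, n_components=4)
projected = features @ W  # reduced-dimension representation
```

Because the projection is a single matrix multiplication applied to the filter-bank vector, a representation of this kind is cheap to compute at recognition time, which is consistent with the abstract's remark about lower computational cost for the filter-bank-derived variant.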