Cinematic techniques for speech processing: temporal decomposition and multivariate linear prediction

Authors:
Claude Montacié; Paul Deléglise; Frédéric Bimbot; Marie-José Caraty
Affiliations:
Laforia, Université P. & M. Curie, C.N.R.S., Paris Cedex 05, France; Département Signal, Télécom Paris, C.N.R.S., Paris Cedex 13, France; Département Signal, Télécom Paris, C.N.R.S., Paris Cedex 13, France; Laforia, Université P. & M. Curie, C.N.R.S., Paris Cedex 05, France
Venue:
ICASSP'92 Proceedings of the 1992 IEEE international conference on Acoustics, speech and signal processing - Volume 1
Year:
1992

Abstract

This paper presents two models of the spectral evolution of speech signals that can account for some aspects of speech variability: Temporal Decomposition and Multivariate Linear Prediction. A series of acoustic-phonetic decoding experiments carried out at Télécom Paris, using the spectral targets produced by Temporal Decomposition in speaker-dependent mode, gives good results compared to a reference system (70% vs. 60% for the first choice). A series of text-independent speaker recognition experiments using the original method developed by Laforia, based on long-term Multivariate Auto-Regressive modelling, gives first-rate results (98.4% recognition rate for 420 speakers) using no more than one sentence. Given the interpretation of the two models, these results show the value of cinematic models for obtaining a speech signal representation with reduced variability.
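To make the idea of multivariate linear prediction concrete, the following is a minimal sketch of a first-order vector autoregressive fit, in which each spectral frame x[t] is predicted from the previous frame as x[t] ≈ A x[t-1]. It is a generic least-squares illustration and not the method developed by Laforia: the model order of 1, the plain-Python implementation, and the names `solve`, `fit_var1` and `simulate` are all assumptions made for this example.

```python
def solve(m, b):
    """Solve m @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(m)]
    for col in range(n):
        # Pick the row with the largest pivot to limit round-off error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def fit_var1(frames):
    """Least-squares estimate of A in x[t] ~ A x[t-1] (hypothetical helper).

    frames: list of equal-length vectors, e.g. one spectral vector per frame.
    """
    d = len(frames[0])
    c0 = [[0.0] * d for _ in range(d)]  # sum of x[t-1] x[t-1]^T
    c1 = [[0.0] * d for _ in range(d)]  # sum of x[t]   x[t-1]^T
    for prev, cur in zip(frames, frames[1:]):
        for i in range(d):
            for j in range(d):
                c0[i][j] += prev[i] * prev[j]
                c1[i][j] += cur[i] * prev[j]
    # Normal equations: A @ c0 = c1. Since c0 is symmetric, each row a_i
    # of A satisfies c0 @ a_i = c1[i].
    return [solve(c0, c1[i]) for i in range(d)]


def simulate(a, x0, n):
    """Generate n noiseless frames from x[t] = a @ x[t-1], starting at x0."""
    frames = [x0]
    for _ in range(n - 1):
        prev = frames[-1]
        frames.append([sum(a[i][j] * prev[j] for j in range(len(prev)))
                       for i in range(len(prev))])
    return frames


# Toy check: recover a known 2x2 prediction matrix from synthetic frames.
a_true = [[0.9, 0.1], [-0.2, 0.8]]
a_hat = fit_var1(simulate(a_true, [1.0, 0.0], 20))
```

In a speaker recognition setting, one could imagine estimating such a matrix from the frames of a single sentence and comparing it with matrices stored for known speakers. The comparison measure and the model order used in the paper are not given in this abstract, so they are left out of the sketch.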