Cinematic techniques for speech processing: temporal decomposition and multivariate linear prediction

  • Authors:
  • Claude Montacié;Paul Deléglise;Frédéric Bimbot;Marie-José Caraty

  • Affiliations:
  • Laforia, Université P. & M. Curie, C.N.R.S., Paris Cedex 05, France;Département Signal, Télécom Paris, C.N.R.S., Paris Cedex 13, France;Département Signal, Télécom Paris, C.N.R.S., Paris Cedex 13, France;Laforia, Université P. & M. Curie, C.N.R.S., Paris Cedex 05, France

  • Venue:
  • ICASSP'92 Proceedings of the 1992 IEEE international conference on Acoustics, speech and signal processing - Volume 1
  • Year:
  • 1992

Abstract

This paper presents two models of the spectral evolution of speech signals that can account for some aspects of speech variability: Temporal Decomposition and Multivariate Linear Prediction. A series of acoustic-phonetic decoding experiments carried out at Télécom Paris, in speaker-dependent mode and using the spectral targets produced by Temporal Decomposition, gives good results compared with a reference system (70% vs. 60% for the first choice). A series of text-independent speaker recognition experiments, using the original method developed at Laforia and based on long-term Multivariate Auto-Regressive modelling, gives first-rate results (98.4% recognition rate for 420 speakers) using no more than one sentence. Given the interpretation of these models, the results show the value of cinematic models for obtaining a speech signal representation with reduced variability.
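
To make the idea of multivariate linear prediction concrete, here is a minimal sketch in which each spectral frame x[t] is predicted as a sum of matrices A_k applied to the previous frames x[t-k]. The matrices are fitted by ordinary least squares. This is a generic textbook formulation and not the Laforia method, which the abstract does not describe. The function name `fit_mar`, its parameters, and the choice of a least-squares solver are all assumptions made for illustration.

```python
import numpy as np


def fit_mar(frames, order=2):
    """Least-squares fit of x[t] ~ sum_{k=1..order} A_k @ x[t-k].

    frames: array of shape (n_frames, dim), one spectral vector per row.
    Returns (mats, residual): the list [A_1, ..., A_order], each of shape
    (dim, dim), and the prediction error for every frame t >= order.
    Illustrative sketch only; not the method used in the paper.
    """
    frames = np.asarray(frames, dtype=float)
    n_frames, dim = frames.shape
    if n_frames <= order:
        raise ValueError("need more frames than the model order")

    # Each regressor row stacks the past frames [x[t-1], ..., x[t-order]].
    past = np.hstack(
        [frames[order - k : n_frames - k] for k in range(1, order + 1)]
    )
    target = frames[order:]

    # Solve past @ coeffs ~ target in the least-squares sense.
    coeffs, *_ = np.linalg.lstsq(past, target, rcond=None)

    # Rows k*dim .. (k+1)*dim of coeffs hold the transpose of A_{k+1}.
    mats = [coeffs[k * dim : (k + 1) * dim].T for k in range(order)]
    residual = target - past @ coeffs
    return mats, residual
```

In a speaker recognition setting, one would presumably fit such a model to the frames of each utterance and compare the fitted models or their residuals across speakers. How the comparison is done in the paper is not stated in the abstract.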