C-means clustering applied to speech discrimination

  • Authors:
  • J. M. Górriz;J. Ramírez;I. Turias;C. G. Puntonet;J. González;E. W. Lang

  • Affiliations:
  • Dept. of Signal Theory, Networking and Communications, University of Granada, Spain;Dept. of Signal Theory, Networking and Communications, University of Granada, Spain;Dept. of Computer Science, University of Cádiz, Spain;Dept. of Computer Architecture and Technology, University of Granada, Spain;Dept. of Computer Architecture and Technology, University of Granada, Spain;AG Neuro- und Bioinformatik, Universität Regensburg, Germany

  • Venue:
  • ICCS'06 Proceedings of the 6th international conference on Computational Science - Volume Part I
  • Year:
  • 2006

Abstract

An effective voice activity detection (VAD) algorithm is proposed for improving speech recognition performance in noisy environments. The proposed speech/pause discrimination method is based on a hard-decision clustering approach built over a set of subband log-energies. The presence of speech frames (a new cluster) is detected with a basic sequential algorithmic scheme (BSAS), using a given "distance" (in this case, the geometrical distance) and a suitable threshold. The accuracy of the proposed algorithm, Cl-VAD, stems from a decision function defined over a multiple-observation (MO) window of averaged subband log-energies and from modeling the noise subspace as a set of cluster prototypes. The clustering approach also makes the method time-efficient, which is essential in real-time VAD applications such as speech recognition. An exhaustive analysis on the Spanish SpeechDat-Car databases is conducted to assess the performance of the proposed method and to compare it with existing standard VAD methods. The results show improvements in detection accuracy over standard VADs and over a representative set of recently reported VAD algorithms.
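The BSAS step mentioned in the abstract can be illustrated with a minimal sketch. It is the generic textbook form of BSAS with Euclidean distance, not the authors' Cl-VAD implementation: the multiple-observation window and the noise-prototype modeling are not reproduced, and the function name `bsas`, its parameters, and the toy two-subband log-energy values are all assumptions made for illustration.

```python
import math


def bsas(vectors, threshold, max_clusters):
    """Generic BSAS sketch (hypothetical helper, not the paper's code).

    Each vector is compared with the existing cluster prototypes. If it is
    farther than `threshold` from the nearest one and fewer than
    `max_clusters` exist, it starts a new cluster; otherwise it joins the
    nearest cluster, whose prototype (running mean) is updated.
    """
    prototypes = []  # running mean of each cluster
    counts = []      # number of vectors assigned to each cluster
    labels = []      # cluster index chosen for each input vector

    for x in vectors:
        nearest = None
        if prototypes:
            dists = [math.dist(x, p) for p in prototypes]
            nearest = min(range(len(dists)), key=dists.__getitem__)

        far_from_all = nearest is None or dists[nearest] > threshold
        if nearest is None or (far_from_all and len(prototypes) < max_clusters):
            # Open a new cluster with this vector as its prototype.
            prototypes.append(list(x))
            counts.append(1)
            labels.append(len(prototypes) - 1)
        else:
            # Hard decision: assign to the nearest cluster, update its mean.
            counts[nearest] += 1
            n = counts[nearest]
            prototypes[nearest] = [
                p + (xi - p) / n for p, xi in zip(prototypes[nearest], x)
            ]
            labels.append(nearest)

    return labels, prototypes


# Toy data (assumed values): low-energy frames stand in for noise,
# high-energy frames for speech.
frames = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (0.9, 1.1), (5.2, 4.8)]
labels, prototypes = bsas(frames, threshold=2.0, max_clusters=2)
```

In this toy run the first cluster collects the low-energy frames and a second cluster opens when the first high-energy frame arrives, which mirrors the idea of treating speech as "a new cluster" relative to the noise prototypes. The threshold value is arbitrary here; the abstract only states that a suitable threshold is used.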