Unsupervised learning of phonemes of whispered speech in a noisy environment based on convolutive non-negative matrix factorization

  • Authors:
  • Jian Zhou;Ruiyu Liang;Li Zhao;Liang Tao;Cairong Zou

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2014

Abstract

This paper develops an algorithm that can be optimized for a specific acoustic environment to improve the intelligibility of whispered speech. A new convolutive non-negative matrix factorization (CNMF) algorithm is proposed to extract phoneme bases from noisy whispered speech, using noise bases obtained by prior learning; these noise bases are trained with conventional non-negative matrix factorization (NMF). The objective function is a divergence function with a sparseness constraint term, from which multiplicative update rules are derived for the phoneme base matrix and the corresponding weight matrix. The weights of the previously learned noise bases are also updated during the phoneme learning stage. Listening experiments were conducted to assess the intelligibility of speech synthesized using the proposed algorithm. The experimental results indicate that the proposed algorithm is very effective at improving the intelligibility of whispers in various noise contexts and that it outperforms conventional algorithms.
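The two-stage procedure described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state the exact divergence, sparseness term, or update rules. The sketch therefore assumes the generalized Kullback-Leibler divergence, an L1 penalty (weight `lam`) on the phoneme weights, and a convolutive model in which the spectrogram is approximated by a sum of time-shifted bases. All function and parameter names (`nmf_kl`, `learn_phoneme_bases`, `T`, `lam`) are hypothetical.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero and log(0)

def shift(X, t):
    """Shift columns right by t (t > 0) or left by -t (t < 0), zero-filling."""
    out = np.zeros_like(X)
    if t == 0:
        out[:] = X
    elif t > 0:
        out[:, t:] = X[:, :-t]
    else:
        out[:, :t] = X[:, -t:]
    return out

def kl_div(V, L):
    """Generalized KL divergence D(V || L) (assumed objective)."""
    return float(np.sum(V * np.log((V + EPS) / (L + EPS)) - V + L))

def nmf_kl(V, rank, n_iter=200, seed=0):
    """Stage 1: conventional NMF (KL multiplicative updates), e.g. on noise only."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((rank, V.shape[1])) + EPS
    ones = np.ones_like(V)
    for _ in range(n_iter):
        R = V / (W @ H + EPS)
        W *= (R @ H.T) / (ones @ H.T + EPS)
        R = V / (W @ H + EPS)
        H *= (W.T @ R) / (W.T @ ones + EPS)
    return W, H

def reconstruct(Ws, Hs, Wn, Hn):
    """Model: V ~ sum_t Ws[t] @ shift(Hs, t) + Wn @ Hn."""
    L = Wn @ Hn
    for t in range(Ws.shape[0]):
        L = L + Ws[t] @ shift(Hs, t)
    return L

def learn_phoneme_bases(V, Wn, rank, T=4, lam=0.1, n_iter=200, seed=0):
    """Stage 2: learn convolutive bases Ws with fixed noise bases Wn.

    Ws has shape (T, n_freq, rank). The phoneme weights Hs and the noise
    weights Hn are both updated; only Hs carries the L1 sparseness penalty.
    """
    rng = np.random.default_rng(seed)
    F, N = V.shape
    Ws = rng.random((T, F, rank)) + EPS
    Hs = rng.random((rank, N)) + EPS
    Hn = rng.random((Wn.shape[1], N)) + EPS
    ones = np.ones_like(V)
    for _ in range(n_iter):
        # Update every time slice of the phoneme bases.
        R = V / (reconstruct(Ws, Hs, Wn, Hn) + EPS)
        for t in range(T):
            Ht = shift(Hs, t)
            Ws[t] *= (R @ Ht.T) / (ones @ Ht.T + EPS)
        # Rescale so each basis sums to one; Hs absorbs the scale, so the
        # reconstruction is unchanged and the penalty cannot be dodged by scaling.
        norm = Ws.sum(axis=(0, 1)) + EPS
        Ws /= norm
        Hs *= norm[:, None]
        # Update phoneme weights; lam in the denominator promotes sparseness.
        R = V / (reconstruct(Ws, Hs, Wn, Hn) + EPS)
        num = np.zeros_like(Hs)
        den = np.zeros_like(Hs)
        for t in range(T):
            num += Ws[t].T @ shift(R, -t)
            den += Ws[t].T @ shift(ones, -t)
        Hs *= num / (den + lam + EPS)
        # Update the weights of the fixed, pre-trained noise bases.
        R = V / (reconstruct(Ws, Hs, Wn, Hn) + EPS)
        Hn *= (Wn.T @ R) / (Wn.T @ ones + EPS)
    return Ws, Hs, Hn
```

In this sketch, `nmf_kl` would be run on a magnitude spectrogram of noise alone to obtain `Wn`, and `learn_phoneme_bases` would then be run on the spectrogram of noisy whispered speech. Keeping `Wn` fixed while still updating `Hn` mirrors the abstract's statement that the weights of the noise bases are updated during phoneme learning.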