Unsupervised learning of phonemes of whispered speech in a noisy environment based on convolutive non-negative matrix factorization

Authors:
Jian Zhou;Ruiyu Liang;Li Zhao;Liang Tao;Cairong Zou
Venue:
Information Sciences: an International Journal
Year:
2014

Quantified Score

Hi-index	0.07

Abstract

This paper develops an algorithm that can be optimized for a specific acoustic environment to improve the intelligibility of whispered speech. A new convolutive non-negative matrix factorization (NMF) algorithm is proposed that extracts phoneme bases from noisy whispered speech, using noise bases learned beforehand with conventional NMF. The objective function is a divergence function with a sparseness constraint term, from which multiplicative update rules are derived for the phoneme base matrix and its corresponding weight matrix. The weights of the pre-learned noise bases are also updated during the phoneme learning stage. Listening experiments were conducted to assess the intelligibility of speech synthesized with the proposed algorithm. The results indicate that the algorithm improves the intelligibility of whispers in various noise contexts and outperforms conventional algorithms.
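
The two-stage scheme described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the update rules, so this assumes the generalized Kullback-Leibler divergence as the divergence function, an L1 penalty (weight `lam`) on the phoneme weights as the sparseness term, and textbook multiplicative updates. All function and parameter names (`nmf_kl`, `learn_phoneme_bases`, `k_s`, `T`, `lam`) are hypothetical. Noise bases `Wn` are learned first with conventional NMF and then held fixed, while their weights `Hn` are updated together with the convolutive phoneme bases `Ws` and weights `Hs`.

```python
import numpy as np

EPS = 1e-9  # guards against division by zero and log(0)

def shift(X, t):
    """Shift the columns of X right by t (left if t < 0), zero-filling."""
    out = np.zeros_like(X)
    if t == 0:
        out[:] = X
    elif t > 0:
        out[:, t:] = X[:, :-t]
    else:
        out[:, :t] = X[:, -t:]
    return out

def kl_div(V, L):
    """Generalized KL divergence between V and its model L."""
    return float(np.sum(V * np.log((V + EPS) / (L + EPS)) - V + L))

def nmf_kl(V, k, n_iter=100, seed=0):
    """Conventional NMF (KL multiplicative updates), used here for noise."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], k)) + EPS
    H = rng.random((k, V.shape[1])) + EPS
    ones = np.ones_like(V)
    for _ in range(n_iter):
        H *= (W.T @ (V / (W @ H + EPS))) / (W.T @ ones + EPS)
        W *= ((V / (W @ H + EPS)) @ H.T) / (ones @ H.T + EPS)
    return W, H

def model(Ws, Hs, Wn, Hn):
    """Convolutive speech term plus conventional noise term."""
    L = Wn @ Hn
    for t in range(Ws.shape[0]):
        L = L + Ws[t] @ shift(Hs, t)
    return L

def learn_phoneme_bases(V, Wn, k_s, T, lam=0.1, n_iter=100, seed=0):
    """Learn T-frame phoneme bases Ws from noisy V with fixed noise bases Wn.

    Assumption: sparseness is an L1 penalty on Hs, which adds `lam` to the
    denominator of the Hs update.
    """
    rng = np.random.default_rng(seed)
    F, N = V.shape
    Ws = rng.random((T, F, k_s)) + EPS
    Hs = rng.random((k_s, N)) + EPS
    Hn = rng.random((Wn.shape[1], N)) + EPS
    ones = np.ones_like(V)
    history = [kl_div(V, model(Ws, Hs, Wn, Hn))]
    for _ in range(n_iter):
        # Update the phoneme bases for every time lag t.
        R = V / (model(Ws, Hs, Wn, Hn) + EPS)
        for t in range(T):
            Hs_t = shift(Hs, t)
            Ws[t] *= (R @ Hs_t.T) / (ones @ Hs_t.T + EPS)
        # Update noise weights and (sparse) phoneme weights; Wn stays fixed.
        R = V / (model(Ws, Hs, Wn, Hn) + EPS)
        Hn *= (Wn.T @ R) / (Wn.T @ ones + EPS)
        num = sum(Ws[t].T @ shift(R, -t) for t in range(T))
        den = sum(Ws[t].T @ shift(ones, -t) for t in range(T)) + lam
        Hs *= num / (den + EPS)
        history.append(kl_div(V, model(Ws, Hs, Wn, Hn)))
    return Ws, Hs, Hn, history
```

In this sketch `V` would be a non-negative magnitude spectrogram (frequency bins by frames) of noisy whispered speech, and `Wn` the output of `nmf_kl` run on a noise-only spectrogram from the same environment. A practical version would probably also normalize the columns of `Ws`, since an L1 penalty on `Hs` alone can be offset by rescaling the bases.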