Non-intrusive speech quality assessment using several combinations of auditory features

Authors:
Rajesh Kumar Dubey;Arun Kumar
Affiliations:
Department of Electronics and Communication Engineering, Jaypee Institute of Information Technology (JIIT), Noida, India;CARE, Indian Institute of Technology (IIT), Delhi, India
Venue:
International Journal of Speech Technology
Year:
2013

Citing 8
Cited 0

Subjective comparison and evaluation of speech enhancement algorithms
Speech Communication

Use of Line Spectral Frequencies for Emotion Recognition from Speech
ICPR '10 Proceedings of the 2010 20th International Conference on Pattern Recognition

Non-intrusive speech quality assessment with support vector regression
MMM'10 Proceedings of the 16th international conference on Advances in Multimedia Modeling

A Multiresolution Model of Auditory Excitation Pattern and Its Application to Objective Evaluation of Perceived Speech Quality
IEEE Transactions on Audio, Speech, and Language Processing

P.563—The ITU-T Standard for Single-Ended Speech Quality Assessment
IEEE Transactions on Audio, Speech, and Language Processing

Single-Ended Speech Quality Measurement Using Machine Learning Methods
IEEE Transactions on Audio, Speech, and Language Processing

Low-Complexity, Nonintrusive Speech Quality Assessment
IEEE Transactions on Audio, Speech, and Language Processing

Evaluation of Objective Quality Measures for Speech Enhancement
IEEE Transactions on Audio, Speech, and Language Processing

Abstract

Speech quality estimation is essential for monitoring and maintaining quality of service at different nodes of modern telecommunication networks, and it is also needed when selecting codecs for speech communication systems. Non-intrusive speech quality evaluation does not require the original clean speech signal as a reference, which makes it suitable for evaluating speech quality at any node of a communication network. In this paper, non-intrusive quality assessment of narrowband speech is performed by training Gaussian Mixture Models (GMMs) on several combinations of auditory perception and speech production features: principal components of Lyon's auditory model features, mel-frequency cepstral coefficients (MFCC), line spectral frequencies (LSF), and their first and second differences. Results are obtained and compared across these feature combinations on three sets of databases, and are also compared with ITU-T Recommendation P.563 for non-intrusive speech quality assessment. Many combinations of these feature sets are found to outperform ITU-T Recommendation P.563 under the test conditions.
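
As a rough illustration of how such a combined feature vector might be assembled, the sketch below reduces a set of auditory-model features to a few principal components, appends first and second frame-to-frame differences to each feature set, and concatenates the results. It is not the authors' implementation: the function names, the feature dimensions, the number of retained components, and the random placeholder data are all assumptions, and the differences are taken as plain adjacent-frame differences, which may not match the paper's definition.

```python
import numpy as np


def add_differences(frames):
    """Append first and second frame-to-frame differences to each frame.

    Assumption: simple adjacent-frame differences; the first frame is
    repeated via `prepend` so the output keeps the same number of frames.
    """
    d1 = np.diff(frames, axis=0, prepend=frames[:1])
    d2 = np.diff(d1, axis=0, prepend=d1[:1])
    return np.hstack([frames, d1, d2])


def principal_components(frames, k):
    """Project mean-centered frames onto their first k principal directions."""
    centered = frames - frames.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T


# Placeholder data standing in for real per-frame features (hypothetical sizes).
rng = np.random.default_rng(0)
auditory = rng.normal(size=(200, 64))  # stand-in for auditory-model outputs
mfcc = rng.normal(size=(200, 13))      # stand-in for MFCC frames
lsf = rng.normal(size=(200, 10))       # stand-in for LSF frames

combined = np.hstack([
    add_differences(principal_components(auditory, 8)),
    add_differences(mfcc),
    add_differences(lsf),
])
print(combined.shape)  # (200, 93): (8 + 13 + 10) features x 3
```

A matrix like `combined` would then be the input to GMM training; any subset of the three blocks can be stacked in the same way to form a different feature combination.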