Perceptual speech quality measures separating speech distortion and additive noise degradations

  • Authors:
  • Anis Ben Aicha; Sofia Ben Jebara

  • Affiliations:
  • Ecole Supérieure des Communications de Tunis, Research unit TECHTRA, University of Carthage, Route de Raoued 3.5 Km, Cité El Ghazala, Ariana 2083, Tunisia; Ecole Supérieure des Communications de Tunis, Research unit TECHTRA, University of Carthage, Route de Raoued 3.5 Km, Cité El Ghazala, Ariana 2083, Tunisia

  • Venue:
  • Speech Communication
  • Year:
  • 2012

Abstract

In this paper, novel perceptual criteria measuring speech distortion, additive noise, and overall quality are presented. Based on the masking concept, they are built to measure only the audible degradations perceived by the human ear. The class of perceptual equivalence (CPE) is introduced, which makes it possible to specify the nature of the degradations affecting denoised speech. The CPE is defined in the frequency domain using perceptual tools and is bounded by two curves: the upper bound of perceptual equivalence (UBPE) and the lower bound of perceptual equivalence (LBPE). Denoised speech components belonging to this class are perceptually equivalent to the clean speech components; otherwise, audible degradations are noticed. Based on this concept, new perceptual criteria are developed to assess denoised speech signals. After the criteria are introduced and explained, they are validated by examining their relationship, in terms of scatter plots and Pearson correlation, with ITU-T Recommendation P.835, which specifies three subjective tests that independently evaluate the speech distortion (SIG), the residual background noise (BAK), and the overall quality (MOS). Moreover, the proposed criteria are compared with conventional criteria, indicating an improved ability to predict subjective test results.
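
The two mechanical ideas in the abstract, testing whether a denoised spectral component lies between the LBPE and UBPE curves and correlating an objective criterion with subjective ratings, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the two bounds are computed, so they are passed in as precomputed arrays, and the names `classify_components` and `pearson` as well as all numeric values are hypothetical.

```python
import numpy as np


def classify_components(denoised, lbpe, ubpe):
    """Label each frequency bin of a denoised spectrum.

    0  : inside the class of perceptual equivalence (LBPE <= value <= UBPE)
    +1 : above the upper bound (UBPE)
    -1 : below the lower bound (LBPE)

    Assumption: `lbpe` and `ubpe` are precomputed per-bin bounds; how they
    are derived from the clean speech is not given in the abstract.
    """
    denoised = np.asarray(denoised, dtype=float)
    lbpe = np.asarray(lbpe, dtype=float)
    ubpe = np.asarray(ubpe, dtype=float)
    labels = np.zeros(denoised.shape, dtype=int)
    labels[denoised > ubpe] = 1
    labels[denoised < lbpe] = -1
    return labels


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2)))


# Toy values only, chosen to exercise each branch (not data from the paper).
labels = classify_components(
    denoised=[1.0, 5.0, 0.1],
    lbpe=[0.5, 0.5, 0.5],
    ubpe=[2.0, 2.0, 2.0],
)
r = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Bins labeled 0 would count as perceptually equivalent to the clean speech, while bins labeled +1 or -1 would be candidates for audible degradation. Which side of the class corresponds to additive noise and which to speech distortion is not stated in the abstract, so the sketch leaves that mapping open.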