The analysis of the simplification from the ideal ratio to binary mask in signal-to-noise ratio sense

  • Authors:
  • Shan Liang;Wenju Liu;Wei Jiang;Wei Xue

  • Venue:
  • Speech Communication
  • Year:
  • 2014

Abstract

For speech separation systems, the ideal binary mask (IBM) can be viewed as a simplified goal of the ideal ratio mask (IRM), which is derived from the Wiener filter. Existing research usually justifies this simplification in terms of speech intelligibility. However, the difference between the two masks has not been addressed rigorously in the signal-to-noise ratio (SNR) sense. In this paper, we analytically investigate the difference between the two ideal masks under the assumption of approximate W-Disjoint Orthogonality (AWDO), which holds approximately for many kinds of interference because of the sparse nature of speech. From the analysis, a theoretical upper bound on the difference is obtained under the AWDO assumption. Other findings include a new ratio mask that achieves higher SNR gains than the IRM, and the essential relation between the degree of AWDO and the SNR gain of the IRM.
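
To make the two masks concrete, the following is a minimal sketch of how an IRM and an IBM could be computed per time-frequency unit and compared by output SNR. It uses commonly cited definitions (a Wiener-style energy ratio for the IRM, a 0 dB local-SNR threshold for the IBM, and SNR measured against the clean speech), which are assumptions here and not necessarily the exact formulation of the paper. All function names and the toy data are hypothetical.

```python
import math

def ideal_ratio_mask(speech, noise):
    """Wiener-style ratio mask per T-F unit: |S|^2 / (|S|^2 + |N|^2).
    Assumed definition; units with no energy at all get a mask value of 0."""
    mask = []
    for s, n in zip(speech, noise):
        ps, pn = abs(s) ** 2, abs(n) ** 2
        mask.append(ps / (ps + pn) if ps + pn > 0 else 0.0)
    return mask

def ideal_binary_mask(speech, noise, lc_db=0.0):
    """Binary mask: 1 where the local SNR exceeds lc_db (assumed 0 dB), else 0."""
    threshold = 10 ** (lc_db / 10)
    return [1.0 if abs(s) ** 2 > threshold * abs(n) ** 2 else 0.0
            for s, n in zip(speech, noise)]

def output_snr_db(speech, noise, mask):
    """SNR of the masked mixture measured against the clean speech.
    Assumes the speech has nonzero total energy."""
    signal_energy = sum(abs(s) ** 2 for s in speech)
    error_energy = sum(abs(s - m * (s + n)) ** 2
                       for s, n, m in zip(speech, noise, mask))
    if error_energy == 0:
        return math.inf  # perfect reconstruction
    return 10 * math.log10(signal_energy / error_energy)

# Toy complex T-F coefficients (hypothetical data, four units).
speech = [1 + 0j, 0j, 2 + 0j, 0.5j]
noise = [0j, 1 + 0j, 1 + 0j, 1 + 0j]

irm = ideal_ratio_mask(speech, noise)
ibm = ideal_binary_mask(speech, noise)
print("IRM:", irm, "output SNR (dB):", output_snr_db(speech, noise, irm))
print("IBM:", ibm, "output SNR (dB):", output_snr_db(speech, noise, ibm))
```

In the first two units only one source is active, so the two masks coincide there; they differ only in the units where speech and noise overlap. This is the intuition behind studying the gap under a disjointness assumption such as AWDO.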