Towards a New Image-Based Spectrogram Segmentation Speech Coder Optimised for Intelligibility

  • Authors:
  • Keith A. Jellyman;Nicholas W. Evans;W. M. Liu;J. S. Mason

  • Affiliations:
  • School of Engineering, Swansea University, UK;School of Engineering, Swansea University, UK and EURECOM, Sophia Antipolis, France;School of Engineering, Swansea University, UK;School of Engineering, Swansea University, UK

  • Venue:
  • MMM '09 Proceedings of the 15th International Multimedia Modeling Conference on Advances in Multimedia Modeling
  • Year:
  • 2009

Abstract

Speech intelligibility is the very essence of communications. When high noise can degrade a speech signal to the threshold of intelligibility, for example in mobile and military applications, any further degradation introduced by a speech coder could prove critical. This paper investigates concepts towards a new speech coder that draws upon the field of image processing in a new multimedia approach. The coder is based on a spectrogram segmentation image processing procedure. The design criterion is minimal intelligibility loss in high noise, as opposed to the conventional quality criterion, while keeping the bit rate reasonable. First-phase intelligibility listening test results assessing its potential alongside six standard coders are reported. Experimental results show the robustness of the LD-CELP coder, and the potential of the new coder, with particularly good results in car noise conditions below -4.0 dB.