VoIP speech quality estimation in a mixed context with genetic programming

Authors:
Adil Raja;R. Muhammad Atif Azad;Colin Flanagan;Conor Ryan
Affiliations:
University of Limerick, Limerick, Ireland;University of Limerick, Limerick, Ireland;University of Limerick, Limerick, Ireland;University of Limerick, Limerick, Ireland
Venue:
Proceedings of the 10th annual conference on Genetic and evolutionary computation
Year:
2008

Abstract

Voice over IP (VoIP) speech quality estimation is crucial to providing optimal Quality of Service (QoS). This paper seeks to provide speech quality estimation models with better prediction accuracy by considering a richer set of input features than the current International Telecommunication Union Telecommunication Standardization Sector (ITU-T) recommendations. It addresses a transitional phase in which wideband (WB) networks are becoming available but must co-exist with existing narrowband (NB) setups for the time being. Quality estimation becomes a challenge in such a mixed context. The ITU-T recommendation (termed the E-Model) has recently been extended to deal with the mixed context. However, it evaluates speech degradation in the WB scenario based solely on codec-related distortions, which are only a subset of the factors affecting speech quality on a VoIP network. The extension is derived from speech signals evaluated by human subjects: an expensive and difficult-to-reproduce exercise. This paper innovates by also considering a number of other network distortion types to produce generalised models that predict quality degradation with higher accuracy. To this end, an extensive set of speech samples is subjected to a wide variety of distortions. The degraded signals are evaluated by the best currently available algorithmic approximation of human evaluation of speech to produce quality scores. Using the distortions as input features and the quality scores as targets, we employ Genetic Programming to produce parsimonious models that show considerable prediction gain compared to the E-Model. Unlike some existing approaches, in which the models are tailored to particular telephony codecs, the evolved models generalise across a variety of modern codecs.
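
The modelling setup described in the abstract (distortions as input features, quality scores as targets, a preference for parsimonious models) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the feature names `loss_rate` and `bitrate_kbps`, the toy data, the tuple-based expression trees, and the choice of breaking error ties by model size are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: scores candidate expression trees the way a
# genetic programming run might, using mean squared error against target
# quality scores and model size as a tie-breaker. All names and data are
# hypothetical; the abstract does not specify the features or the fitness.

def evaluate(tree, sample):
    """Recursively evaluate an expression tree on one sample (dict of features)."""
    if isinstance(tree, (int, float)):
        return float(tree)          # numeric constant
    if isinstance(tree, str):
        return sample[tree]         # input feature looked up by name
    op, left, right = tree
    a, b = evaluate(left, sample), evaluate(right, sample)
    if op == "add":
        return a + b
    if op == "sub":
        return a - b
    if op == "mul":
        return a * b
    if op == "div":
        return a / b if abs(b) > 1e-9 else 1.0  # protected division
    raise ValueError(f"unknown operator: {op}")

def size(tree):
    """Count the nodes in a tree; smaller trees are more parsimonious."""
    if isinstance(tree, tuple):
        return 1 + size(tree[1]) + size(tree[2])
    return 1

def mse(tree, samples, targets):
    """Mean squared error between model outputs and target quality scores."""
    errors = [(evaluate(tree, s) - t) ** 2 for s, t in zip(samples, targets)]
    return sum(errors) / len(errors)

def fitness(tree, samples, targets):
    """(error, size) pair: Python compares tuples element by element,
    so size only matters when two models have equal error."""
    return (mse(tree, samples, targets), size(tree))

# Made-up toy data: scores follow an invented rule, not real measurements.
samples = [
    {"loss_rate": 0.00, "bitrate_kbps": 64.0},
    {"loss_rate": 0.05, "bitrate_kbps": 64.0},
    {"loss_rate": 0.10, "bitrate_kbps": 12.2},
    {"loss_rate": 0.20, "bitrate_kbps": 12.2},
]
targets = [4.5 - 10.0 * s["loss_rate"] for s in samples]

# Two candidates with identical predictions; the second carries dead weight.
compact = ("sub", 4.5, ("mul", 10.0, "loss_rate"))
bloated = ("add", compact, ("sub", "bitrate_kbps", "bitrate_kbps"))

best = min([bloated, compact], key=lambda t: fitness(t, samples, targets))
```

Here both candidates reproduce the toy targets exactly, so the comparison falls through to size and `best` is the compact tree. A full genetic programming run would add population initialisation, crossover, mutation and selection around a fitness function of this kind.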