Anthropomorphic coding of speech and audio: a model inversion approach

Authors:
Christian Feldbauer;Gernot Kubin;W. Bastiaan Kleijn
Affiliations:
Signal Processing and Speech Communication Laboratory, Graz University of Technology, Graz, Austria;Signal Processing and Speech Communication Laboratory, Graz University of Technology, Graz, Austria;Department for Signals, Sensors and Systems, KTH, Stockholm, Sweden
Venue:
EURASIP Journal on Applied Signal Processing
Year:
2005

Abstract

Auditory modeling is a well-established methodology that provides insight into human perception and facilitates the extraction of the signal features most relevant to the listener. The aim of this paper is to provide a tutorial on perceptual speech and audio coding using an invertible auditory model. In this approach, the model converts the audio signal into an auditory representation, which is then quantized and coded. Upon decoding, the representation is transformed back into the acoustic domain. Working in the auditory domain converts a complex distortion criterion into a simple one, thus facilitating quantization with low complexity. We briefly review past work on auditory models and describe in more detail the components of our invertible model and its inversion procedure, that is, the method for reconstructing the signal from the output of the auditory model. We summarize attempts to use the auditory representation for low-bit-rate coding. Our approach also allows the inherent redundancy of the human auditory system to be exploited for multiple description (joint source-channel) coding.
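
The encode/decode chain described in the abstract (analysis, quantization, decoding, inversion) can be illustrated with a minimal sketch. This is not the authors' auditory model: a sample-wise power-law compression stands in for the invertible model purely for illustration, and the names `analyze`, `synthesize`, `ALPHA`, and `STEP` are hypothetical choices that do not come from the paper.

```python
import math

ALPHA = 0.3   # hypothetical compression exponent (assumption, not from the paper)
STEP = 0.05   # hypothetical quantizer step size in the transformed domain


def analyze(x):
    """Toy invertible stand-in for an auditory model:
    sign-preserving power-law compression of each sample."""
    return [math.copysign(abs(v) ** ALPHA, v) for v in x]


def synthesize(y):
    """Inverse of analyze(): undo the power-law compression."""
    return [math.copysign(abs(v) ** (1.0 / ALPHA), v) for v in y]


def quantize(y, step=STEP):
    """Uniform scalar quantizer: map each value to an integer index."""
    return [round(v / step) for v in y]


def dequantize(indices, step=STEP):
    """Reconstruct quantized values from their integer indices."""
    return [i * step for i in indices]


def encode(x):
    """Acoustic domain -> transformed domain -> quantizer indices."""
    return quantize(analyze(x))


def decode(indices):
    """Quantizer indices -> transformed domain -> acoustic domain."""
    return synthesize(dequantize(indices))


if __name__ == "__main__":
    # Short test tone as input signal.
    signal = [0.8 * math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
    decoded = decode(encode(signal))
    max_err = max(abs(a - b) for a, b in zip(signal, decoded))
    print(f"max reconstruction error: {max_err:.4f}")
```

In this toy setting the quantization error is bounded by half a step in the transformed domain, so a simple uniform quantizer there corresponds to a signal-dependent error in the acoustic domain. That loosely mirrors the idea of trading a complex distortion criterion for a simple one, under the stated assumptions.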