Perceptual coding of audio signals using adaptive time-frequency transform

  • Authors: Karthikeyan Umapathy; Sridhar Krishnan

  • Affiliations: Department of Electrical and Computer Engineering, Ryerson University, Toronto, ON, Canada (both authors)

  • Venue: EURASIP Journal on Audio, Speech, and Music Processing
  • Year: 2007

Abstract

Wideband digital audio signals have a very high data rate because of their complex nature and the demand for high-quality reproduction. Although recent technological advances have significantly reduced the cost of bandwidth and miniaturized storage devices, the rapid growth in the volume of digital audio content continues to drive the need for better compression algorithms. Over the years, various perceptually lossless compression techniques have been introduced, and transform-based techniques have made a significant impact in recent years. In this paper, we propose one such transform-based compression technique, in which the joint time-frequency (TF) properties of nonstationary audio signals are exploited to create a compact representation that concentrates the signal energy in fewer coefficients. The decomposition coefficients are processed and perceptually filtered to retain only the relevant coefficients. Perceptual filtering (psychoacoustics) is applied in a novel way by performing and analyzing TF-specific psychoacoustic experiments. An added advantage of the proposed technique is that, owing to its signal-adaptive nature, it does not need predetermined segmentation of the audio signals for processing. Eight stereo audio samples of different varieties were used in the study. Subjective listening tests (mean opinion score, MOS) were performed, and the subjective difference grades (SDG) were used to compare the performance of the proposed coder with MP3, AAC, and HE-AAC encoders. The proposed technique achieved compression ratios in the range of 8 to 40, with SDGs ranging from -0.53 to -2.27.
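
To put the reported compression ratios in context, the following minimal sketch converts a compression ratio into an approximate coded bitrate. It assumes the source material is CD-quality PCM (44.1 kHz, 16 bits per sample, two channels); the abstract does not state the sampling parameters, so these defaults and the helper name `bitrate_kbps` are illustrative assumptions, not the authors' figures.

```python
def bitrate_kbps(compression_ratio, sample_rate_hz=44_100,
                 bits_per_sample=16, channels=2):
    """Approximate coded bitrate in kbit/s for a given compression ratio.

    The default PCM parameters are an assumption (CD-quality stereo);
    the abstract does not specify them.
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    # Uncompressed PCM rate in bits per second.
    pcm_bps = sample_rate_hz * bits_per_sample * channels
    return pcm_bps / compression_ratio / 1000.0


# The two ends of the compression-ratio range reported in the abstract.
for ratio in (8, 40):
    print(f"ratio {ratio}: {bitrate_kbps(ratio):.1f} kbit/s")
```

Under these assumed parameters, the reported range of 8 to 40 would correspond to roughly 176 kbit/s down to about 35 kbit/s.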