Perceptual audio hashing algorithm based on Zernike moment and maximum-likelihood watermark detection

  • Authors: Ning Chen; Hai-Dong Xiao

  • Affiliations: School of Information Science & Engineering, East China University of Science and Technology, 200237, China; Sino-US Global Logistics Institute, Antai College of Economics & Management, Shanghai Jiao Tong University, 200030, China

  • Venue: Digital Signal Processing
  • Year: 2013

Abstract

A new perceptual audio hashing algorithm based on maximum-likelihood watermark detection is proposed in this paper. The idea is justified by the fact that a maximum-likelihood watermark detector, when run with a watermark that was never embedded (i.e. a virtual watermark), responds similarly to perceptually close audio. The feature vector, which is composed of the total amplitude of the low-order Zernike moments of each audio frame, is modeled by a Gaussian or Rayleigh distribution. Maximum-likelihood watermark detection is then performed on the feature vector, using virtual watermarks generated by a pseudo-random number generator, to construct the hash vector. Extensive experiments on three large audio databases of different types (speech, instrumental music, and sung voice) demonstrate the efficiency of the proposed scheme in terms of discrimination, perceptual robustness, and identification rate. It is also verified that the proposed scheme outperforms state-of-the-art techniques in perceptual robustness and can be applied successfully to content-based search.
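The hash-construction step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the detector's exact form, so the sketch assumes an i.i.d. Gaussian feature model with an additive watermark, for which the likelihood-ratio statistic reduces (up to a constant offset) to a correlation between the centered features and the watermark. The function names, the bipolar (±1) watermark alphabet, the zero threshold, and the bit-error-rate comparison are all assumptions made for illustration, and Zernike-moment feature extraction is left out: the per-frame feature vector is taken as input.

```python
import numpy as np


def virtual_watermarks(n_bits, n_features, seed):
    """Generate n_bits pseudo-random bipolar (+1/-1) virtual watermarks.

    The seed plays the role of a secret key (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    return rng.choice([-1.0, 1.0], size=(n_bits, n_features))


def detector_hash(features, watermarks):
    """Derive one hash bit per virtual watermark.

    Assumed model: i.i.d. Gaussian features with an additive watermark, so
    the detector statistic is a variance-normalized correlation. Each bit is
    the sign of that statistic (the zero threshold is an assumption).
    """
    x = np.asarray(features, dtype=float)
    centered = x - x.mean()
    sigma2 = centered.var() + 1e-12  # guard against constant input
    stats = watermarks @ centered / sigma2
    return (stats > 0).astype(np.uint8)


def bit_error_rate(h1, h2):
    """Fraction of differing bits between two equal-length hashes."""
    return float(np.mean(np.asarray(h1) != np.asarray(h2)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_frames, n_bits = 512, 128
    wm = virtual_watermarks(n_bits, n_frames, seed=1234)

    original = rng.normal(5.0, 1.0, n_frames)               # stand-in feature vector
    distorted = original + rng.normal(0.0, 0.01, n_frames)  # mild perturbation
    unrelated = rng.normal(5.0, 1.0, n_frames)              # different content

    h = detector_hash(original, wm)
    print("BER vs. distorted:", bit_error_rate(h, detector_hash(distorted, wm)))
    print("BER vs. unrelated:", bit_error_rate(h, detector_hash(unrelated, wm)))
```

Under these assumptions, a slightly perturbed feature vector should produce a hash with a low bit error rate relative to the original, while an unrelated vector should give a rate near 0.5. A Rayleigh feature model, which the abstract also mentions, would call for a different detector statistic than the correlation used here.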