Towards optimal audio "keywords" detection for audio content analysis and discovery

  • Authors:
  • Lie Lu; Alan Hanjalic

  • Affiliations:
  • Microsoft Research Asia; Delft University of Technology

  • Venue:
  • MULTIMEDIA '06 Proceedings of the 14th annual ACM international conference on Multimedia
  • Year:
  • 2006

Abstract

Natural semantic sound clusters in an audio document, also referred to as audio elements, can be seen as analogous to words in a text document. From the obtained set of audio elements, the key audio elements, or audio "keywords", can be detected; these are the elements most prominent in characterizing the content of the audio data. As such, they can be of great use for automatic audio content analysis and discovery. Motivated by the limitations of existing methods for key audio element detection, we propose in this paper a novel unsupervised approach to audio element weighting using multiple audio documents, analogous to word weighting in text document analysis. In our approach, dominant feature vectors (DFV) are first extracted from each audio element and used to measure the similarity between audio elements, from which the occurrence probability of one audio element in different audio documents can be estimated. Then four factors (expected term frequency, expected inverse document frequency, expected term duration, and expected inverse document duration) are calculated and combined to give the importance weight of each audio element. Evaluation of the obtained audio "keywords" and their usability for auditory scene segmentation and audio document clustering, performed on 5 hours of diverse audio data, shows highly promising results.
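
The weighting step described in the abstract can be illustrated with a minimal sketch. The abstract names the four factors but gives neither their formulas nor the rule for combining them, so everything below is an assumption modelled on standard TF-IDF: the function name element_weights, the three input matrices, the logarithmic form of the two inverse factors, and the use of a plain product as the combination are all hypothetical and are not taken from the paper.

```python
import math

def element_weights(occ_prob, exp_count, exp_dur):
    """Hypothetical TF-IDF-style weighting of audio elements.

    occ_prob[k][i]  : estimated probability that element i occurs in document k
    exp_count[k][i] : expected number of occurrences of element i in document k
    exp_dur[k][i]   : expected total duration of element i in document k
    Returns weights[k][i]. All formulas are assumptions, not the paper's.
    """
    eps = 1e-12  # guards against division by zero for elements that never occur
    n_docs = len(occ_prob)
    n_elems = len(occ_prob[0])
    doc_dur = [sum(row) for row in exp_dur]  # duration of each document
    total_dur = sum(doc_dur)                 # duration of the whole collection

    # Collection-level factors, one value per element.
    eidf, eidd = [], []
    for i in range(n_elems):
        # Expected number of documents containing element i.
        exp_df = sum(occ_prob[k][i] for k in range(n_docs))
        # Expected duration of the documents containing element i.
        exp_dd = sum(occ_prob[k][i] * doc_dur[k] for k in range(n_docs))
        eidf.append(math.log(n_docs / max(exp_df, eps)))
        eidd.append(math.log(total_dur / max(exp_dd, eps)))

    # Document-level factors, combined with the collection-level ones.
    weights = []
    for k in range(n_docs):
        count_sum = sum(exp_count[k])
        row = []
        for i in range(n_elems):
            etf = exp_count[k][i] / count_sum if count_sum > 0 else 0.0
            etd = exp_dur[k][i] / doc_dur[k] if doc_dur[k] > 0 else 0.0
            # Assumed combination: simple product of the four factors.
            row.append(etf * eidf[i] * etd * eidd[i])
        weights.append(row)
    return weights
```

Under these assumptions, an element that occurs in every document receives a weight of zero, while an element concentrated in a few documents is weighted up, which mirrors the behaviour of word weighting in text retrieval.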