Latent topic model for audio retrieval

Authors:
Pengfei Hu;Wenju Liu;Wei Jiang;Zhanlei Yang
Venue:
Pattern Recognition
Year:
2014

Abstract

Latent topic models such as Latent Dirichlet Allocation (LDA) were designed for text processing and have also proven successful in audio processing tasks. LDA assumes that the words of each document arise from a mixture of topics, each of which is a multinomial distribution over the vocabulary. When the original LDA is applied to continuous data, word-like units must first be generated by vector quantization (VQ). This discretization usually results in information loss. To overcome this shortcoming, this paper introduces a new topic model named Gaussian-LDA for audio retrieval. In the proposed model, the emission probability is continuous: each topic is modeled as a Gaussian distribution over audio features rather than as a multinomial distribution over a vocabulary. The model therefore skips vector quantization, avoids discretization, and integrates the clustering step into the topic model itself. Audio retrieval experiments demonstrate that Gaussian-LDA achieves better performance than the other compared methods.
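To make the contrast with multinomial emissions concrete, the following is a minimal sketch of a generative process consistent with the abstract's description: topic proportions are drawn per document as in LDA, and each audio frame is drawn from the Gaussian of its assigned topic. The abstract does not give the paper's exact parameterization, so the function name `sample_document`, the diagonal covariance, and all numeric values here are illustrative assumptions, not the authors' implementation.

```python
import random

def sample_document(alpha, means, stds, n_frames, rng):
    """Draw one audio 'document' (a list of feature frames) from the sketch model.

    Assumption: each topic is a diagonal-covariance Gaussian, given by a
    per-dimension mean and standard deviation.
    """
    # Per-document topic proportions theta ~ Dirichlet(alpha),
    # sampled by normalizing independent Gamma draws.
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    theta = [g / total for g in gammas]

    # One topic assignment per frame, drawn from theta.
    topics = rng.choices(range(len(alpha)), weights=theta, k=n_frames)

    # Continuous emission: each frame comes from its topic's Gaussian,
    # so no vector quantization into word-like units is needed.
    frames = [
        [rng.gauss(m, s) for m, s in zip(means[k], stds[k])]
        for k in topics
    ]
    return theta, topics, frames

# Hypothetical setup: 3 topics over 2-dimensional features.
rng = random.Random(0)
alpha = [0.5, 0.5, 0.5]
means = [[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]]
stds = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
theta, topics, frames = sample_document(alpha, means, stds, 200, rng)
```

In standard LDA the last step would instead draw a discrete codeword index from a per-topic multinomial, which is why a VQ codebook has to be built first. Here the topic Gaussians play the role of the clusters.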