Audio query by example using similarity measures between probability density functions of features

Authors:
Marko Helén;Tuomas Virtanen
Affiliations:
Department of Signal Processing, Tampere University of Technology, Tampere, Finland;Department of Signal Processing, Tampere University of Technology, Tampere, Finland
Venue:
EURASIP Journal on Audio, Speech, and Music Processing - Special issue on scalable audio-content analysis
Year:
2010

Abstract

This paper proposes a query-by-example system for generic audio. We estimate the similarity between the example signal and each sample in the queried database by calculating the distance between the probability density functions (pdfs) of their frame-wise acoustic features. Since the features are continuous-valued, we propose to model them using Gaussian mixture models (GMMs) or hidden Markov models (HMMs). The models parametrize each sample efficiently and retain sufficient information for similarity measurement. To measure the distance between the models, we apply a novel Euclidean distance, approximations of the Kullback-Leibler divergence, and a cross-likelihood ratio test. The performance of the measures was tested in simulations in which audio samples were automatically retrieved from a general audio database based on their estimated similarity to a user-provided example. The simulations show that the distance between probability density functions is an accurate measure of similarity. Measures based on GMMs or HMMs are shown to produce better results than existing methods based on simpler statistics or histograms of the features. Good performance at low computational cost is obtained with the proposed Euclidean distance.
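
To illustrate how a Euclidean distance between two GMMs can be evaluated without sampling, the sketch below uses the standard closed form for the integral of a product of two Gaussians, which equals N(m1; m2, S1 + S2). It assumes diagonal covariances and represents a GMM as a list of (weight, means, variances) tuples. That representation, the function names, and the lack of any normalization are assumptions made for illustration; the abstract does not give the formulation, so the distance proposed in the paper may differ.

```python
import math


def gauss_overlap(m1, v1, m2, v2):
    """Integral of the product of two diagonal-covariance Gaussians.

    Uses the identity: integral N(x; m1, v1) N(x; m2, v2) dx = N(m1; m2, v1 + v2).
    m1, m2 are mean vectors; v1, v2 are per-dimension variances.
    """
    log_val = 0.0
    for a, va, b, vb in zip(m1, v1, m2, v2):
        s = va + vb
        log_val += -0.5 * (math.log(2.0 * math.pi * s) + (a - b) ** 2 / s)
    return math.exp(log_val)


def cross_term(gmm_a, gmm_b):
    """Integral of p_a(x) * p_b(x) over all x for two GMMs."""
    return sum(
        wa * wb * gauss_overlap(ma, va, mb, vb)
        for wa, ma, va in gmm_a
        for wb, mb, vb in gmm_b
    )


def squared_l2_distance(gmm_a, gmm_b):
    """Squared Euclidean (L2) distance between two GMM densities.

    Expands integral (p_a - p_b)^2 into three closed-form cross terms.
    The result is clamped at zero to absorb floating-point round-off.
    """
    d = (
        cross_term(gmm_a, gmm_a)
        - 2.0 * cross_term(gmm_a, gmm_b)
        + cross_term(gmm_b, gmm_b)
    )
    return max(d, 0.0)


def rank_by_distance(query_gmm, database):
    """Return database keys sorted from most to least similar to the query.

    `database` is a hypothetical dict mapping a sample name to its GMM.
    """
    return sorted(database, key=lambda k: squared_l2_distance(query_gmm, database[k]))


# Toy one-dimensional models (hypothetical values, not data from the paper).
query = [(1.0, [0.0], [1.0])]
database = {
    "near": [(1.0, [0.5], [1.0])],
    "far": [(0.5, [4.0], [1.0]), (0.5, [6.0], [2.0])],
}
ranking = rank_by_distance(query, database)
```

Because every term is a sum over pairs of components, the cost grows with the product of the component counts of the two models and does not depend on the length of the audio samples.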