High-Level Feature Extraction Using SIFT GMMs and Audio Models

Authors:
Nakamasa Inoue;Tatsuhiko Saito;Koichi Shinoda;Sadaoki Furui
Venue:
ICPR '10 Proceedings of the 2010 20th International Conference on Pattern Recognition
Year:
2010

Citing: 0
Cited by: 2

A fast MAP adaptation technique for GMM-supervector-based video semantic indexing systems
MM '11 Proceedings of the 19th ACM international conference on Multimedia

Multimodal video concept detection via bag of auditory words and multiple kernel learning
MMM '12 Proceedings of the 18th international conference on Advances in Multimedia Modeling

Abstract

We propose a statistical framework for high-level feature extraction that combines SIFT Gaussian mixture models (GMMs) with audio models. SIFT features are extracted from all image frames and modeled by a GMM. In addition, mel-frequency cepstral coefficients (MFCCs) and ergodic hidden Markov models (HMMs) are used to detect high-level features in audio streams. On the TRECVID 2009 corpus, the best mean average precision obtained with SIFT GMMs alone was 0.150; adding audio information improved it to 0.164.
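
The abstract reports its results as mean average precision. As a rough illustration of how such a score can be computed, the sketch below uses the plain, non-interpolated definition over ranked 0/1 relevance lists. It is not the evaluation code used for the paper, and the official TRECVID tooling may estimate the measure differently. The function names, the concept names, and the toy rankings are hypothetical, and the sketch assumes each ranked list contains every relevant shot for its concept.

```python
def average_precision(ranked_labels):
    """Average precision for one concept.

    ranked_labels: 0/1 relevance flags ordered by descending detector score.
    Assumption: the list contains all relevant items for the concept.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_labels, start=1):
        if relevant:
            hits += 1
            # Precision at this rank, accumulated only at relevant positions.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings):
    """Mean of per-concept average precision.

    rankings: dict mapping a concept name to its ranked 0/1 relevance list.
    """
    scores = [average_precision(labels) for labels in rankings.values()]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy rankings for two concepts (not data from the paper).
toy_rankings = {
    "concept_a": [1, 0, 1, 0],
    "concept_b": [0, 1, 0, 0],
}
toy_map = mean_average_precision(toy_rankings)
```

In this setting, the gain from 0.150 to 0.164 would correspond to the per-concept average precision rising on average across the evaluated high-level features once audio scores are combined with the SIFT GMM scores.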