Video fingerprinting using Latent Dirichlet Allocation and facial images

Authors:
Nicholas Vretos;Nikos Nikolaidis;Ioannis Pitas
Affiliations:
Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece;Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece;Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece
Venue:
Pattern Recognition
Year:
2012

Abstract

This paper investigates whether latent aspects of a video can be extracted and used to build a video fingerprinting framework. Semantic visual information about humans, specifically face occurrences in video frames, is combined with a generative probabilistic model, Latent Dirichlet Allocation (LDA), for this purpose. The latent variables, namely the video topics, are modeled as a mixture of distributions of faces in each video. The method also clusters the detected faces using the Scale-Invariant Feature Transform (SIFT) and adapts the bag-of-words concept into a bag-of-faces one in order to ensure exchangeability between topic distributions. Experimental results on three different data sets show low misclassification rates, on the order of 2%, and false rejection rates of 0%. These rates provide evidence that the proposed method performs well for video fingerprinting.
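
The bag-of-faces idea in the abstract can be illustrated with a small sketch. It assumes each video has already been reduced to a list of face-cluster ids (face detection and SIFT-based clustering are not shown), and it fits LDA with a collapsed Gibbs sampler, which is a stand-in chosen for brevity and not necessarily the inference procedure used in the paper. The function names, the hyperparameters `alpha` and `beta`, and the L1 comparison of topic proportions are all illustrative assumptions.

```python
import random


def lda_gibbs(videos, num_faces, num_topics, alpha=0.1, beta=0.01,
              iters=200, seed=0):
    """Fit LDA to bag-of-faces documents with collapsed Gibbs sampling.

    videos: list of videos, each a list of face-cluster ids in range(num_faces).
    Returns one topic-proportion vector per video (a candidate fingerprint).
    Assumption: this sampler is a simple substitute for the paper's inference.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in videos]             # counts per video/topic
    topic_face = [[0] * num_faces for _ in range(num_topics)]  # counts per topic/face
    topic_total = [0] * num_topics                             # faces assigned per topic
    assign = []

    # Random initial topic assignment for every face occurrence.
    for d, video in enumerate(videos):
        zs = []
        for w in video:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_face[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, video in enumerate(videos):
            for i, w in enumerate(video):
                # Remove the current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_face[z][w] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_face[k][w] + beta)
                    / (topic_total[k] + num_faces * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_face[z][w] += 1
                topic_total[z] += 1

    # Smoothed topic proportions per video.
    thetas = []
    for d, video in enumerate(videos):
        denom = len(video) + num_topics * alpha
        thetas.append([(doc_topic[d][k] + alpha) / denom
                       for k in range(num_topics)])
    return thetas


def l1_distance(p, q):
    """L1 distance between two topic-proportion vectors (hypothetical metric)."""
    return sum(abs(a - b) for a, b in zip(p, q))


if __name__ == "__main__":
    # Toy data: three videos over four hypothetical face clusters.
    videos = [[0, 0, 1, 0, 1], [2, 3, 3, 2, 2], [0, 1, 1, 0, 0]]
    thetas = lda_gibbs(videos, num_faces=4, num_topics=2)
    for d, theta in enumerate(thetas):
        print(d, [round(t, 3) for t in theta])
```

In this sketch, two videos would be compared by the distance between their topic-proportion vectors. How the paper actually matches fingerprints and sets decision thresholds is not stated in the abstract, so that step is left out.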