Is that scene dangerous?: transferring knowledge over a video stream

  • Authors: Omar U. Florez; Curtis Dyreson
  • Affiliations: Utah State University, Logan, UT, USA (both authors)

  • Venue: Proceedings of the 5th Ph.D. workshop on Information and knowledge management
  • Year: 2012


Abstract

Activity mining in traffic scenes aims to automatically explain the complex interactions among moving objects recorded with a surveillance camera. Traditional machine learning algorithms generate a model and validate it with manually labeled data, a time-consuming and expensive task. A common issue is that these models become outdated when external conditions change in later recordings, such as dynamic backgrounds, illumination, and weather. Such changes effectively impose a new domain that often makes the original model inaccurate for clustering and classification tasks. If we directly apply a statistical model trained in one domain to another domain over the same stream, the algorithm's performance degrades noticeably because of distinct activity representations and different marginal and conditional distributions. We approach this problem in two stages: 1) we present mature results on a hierarchical Bayesian model that represents every video scene as a multinomial distribution over topics, and 2) we present early-stage evidence for an algorithm that transfers knowledge across two instances of the hierarchical model described in the first stage. A concrete example of the first stage is a simple (but efficient) algorithm that incrementally generates association rules to explain current traffic scenes as co-occurrence relationships between topics. This approach is especially useful when we have no labels in the target domain but do have some labeled information (which frames contain dangerous scenes?) in the source domain, by far the most frequent case in real surveillance systems. The algorithm clusters domain-dependent activities in the latent space and bridges them across domains via domain-independent activities. Our experiments show that our method successfully competes with an SVM at generalization when the temporal gap between the source and target domains is large.
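
The abstract does not spell out the first stage's scene representation; for illustration only, a generic LDA-style formulation of "a scene as a multinomial distribution over topics" could look like the following (the notation is an assumption and may differ from the paper's exact hierarchical Bayesian model):

    % Illustrative LDA-style sketch (assumed notation, not necessarily the paper's model).
    % Scene d has a topic mixture theta_d; each visual word w_{d,n} is drawn from its topic z_{d,n}.
    \begin{align*}
      \theta_d &\sim \mathrm{Dirichlet}(\alpha)                  && \text{topic mixture of scene } d \\
      z_{d,n} \mid \theta_d &\sim \mathrm{Multinomial}(\theta_d) && \text{topic assignment of visual word } n \\
      w_{d,n} \mid z_{d,n}, \phi &\sim \mathrm{Multinomial}(\phi_{z_{d,n}}), \quad \phi_k \sim \mathrm{Dirichlet}(\beta)
    \end{align*}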
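
The "simple (but efficient) algorithm to incrementally generate association rules" is likewise only named; the sketch below is a minimal, hypothetical illustration of mining rules between co-occurring topics one scene at a time (the class name, thresholds, and counting scheme are assumptions, not the authors' implementation):

    # Hypothetical sketch: incrementally mine association rules between topics
    # that co-occur in the same scene (not the paper's actual algorithm).
    from collections import Counter
    from itertools import combinations

    class TopicRuleMiner:
        def __init__(self, min_support=0.05, min_confidence=0.6):
            self.min_support = min_support
            self.min_confidence = min_confidence
            self.n_scenes = 0
            self.topic_counts = Counter()   # how often each topic is active
            self.pair_counts = Counter()    # how often each pair of topics co-occurs

        def update(self, scene_topics):
            """Incrementally absorb the set of topics active in one scene."""
            self.n_scenes += 1
            topics = set(scene_topics)
            self.topic_counts.update(topics)
            self.pair_counts.update(combinations(sorted(topics), 2))

        def rules(self):
            """Yield rules (head, tail, support, confidence) passing both thresholds."""
            for (a, b), joint in self.pair_counts.items():
                support = joint / self.n_scenes
                if support < self.min_support:
                    continue
                for head, tail in ((a, b), (b, a)):
                    confidence = joint / self.topic_counts[head]
                    if confidence >= self.min_confidence:
                        yield head, tail, support, confidence

    # Usage: each scene is summarized by the indices of its dominant topics.
    miner = TopicRuleMiner()
    for scene in [{0, 2}, {0, 2, 5}, {1, 3}, {0, 2}]:
        miner.update(scene)
    print(list(miner.rules()))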
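
Finally, the idea of bridging domain-dependent activities through domain-independent ones can be approximated, in a deliberately simplified way, by training a classifier only on topics assumed to be stable across domains; the scikit-learn sketch below is that simplification (the function name, inputs, and logistic-regression stand-in are all assumptions), not the clustering-based transfer method the abstract describes:

    # Hypothetical simplification: project both domains onto shared ("domain-
    # independent") topics and reuse the labeled source domain to classify the
    # unlabeled target domain.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def shared_topic_transfer(src_theta, src_labels, tgt_theta, shared_topics):
        """src_theta, tgt_theta: (scenes x topics) topic-proportion matrices;
        shared_topics: indices of topics assumed to be domain-independent."""
        clf = LogisticRegression(max_iter=1000)
        clf.fit(src_theta[:, shared_topics], src_labels)   # learn "dangerous scene?" on the source
        return clf.predict(tgt_theta[:, shared_topics])    # predict on the unlabeled target

    # Tiny synthetic usage example with random topic proportions.
    rng = np.random.default_rng(0)
    src = rng.dirichlet(np.ones(6), size=40)
    tgt = rng.dirichlet(np.ones(6), size=10)
    labels = (src[:, 0] > np.median(src[:, 0])).astype(int)   # stand-in for "dangerous scene" labels
    print(shared_topic_transfer(src, labels, tgt, shared_topics=[0, 1, 2]))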