Is that scene dangerous?: transferring knowledge over a video stream

  • Authors: Omar U. Florez; Curtis Dyreson
  • Affiliations: Utah State University, Logan, UT, USA (both authors)

  • Venue: Proceedings of the 5th Ph.D. workshop on Information and knowledge management
  • Year: 2012


Abstract

Activity mining in traffic scenes aims to automatically explain the complex interactions among moving objects recorded with a surveillance camera. Traditional machine learning algorithms generate a model and validate it with manually labeled data, a time-consuming and expensive task. A common issue is that these models become outdated when external conditions change in later recordings, such as dynamic backgrounds, illumination, and weather. Such changes effectively impose a new domain that often makes the original model inaccurate for clustering and classification tasks. If we directly apply a statistical model trained in one domain to another domain over the same stream, the algorithm's performance degrades noticeably because of distinct activity representations and different marginal and conditional distributions. We approach this problem in two stages: 1) we present mature results on a hierarchical Bayesian model that represents every video scene as a multinomial distribution over topics, and 2) we present early-stage evidence for an algorithm that transfers knowledge across two instances of the hierarchical model described in the first stage. A concrete example of the first stage is a simple (but efficient) algorithm that incrementally generates association rules to explain current traffic scenes as co-occurrence relationships between topics. This approach is especially useful when we have no labels in the target domain but do have some labeled information (which frames contain dangerous scenes?) in the source domain, by far the most frequent case in real surveillance systems. The algorithm clusters domain-dependent activities in the latent space and bridges them across domains via domain-independent activities. Our experiments show that our method successfully competes with an SVM at generalization when the temporal gap between the source and target domains is large.
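
The abstract does not spell out the first stage's scene representation; for illustration only, a generic LDA-style formulation of "a scene as a multinomial distribution over topics" could look like the following (the notation is an assumption and may differ from the paper's exact hierarchical Bayesian model):

    % Illustrative LDA-style sketch (assumed notation, not necessarily the paper's model).
    % Scene d has a topic mixture theta_d; each visual word w_{d,n} is drawn from its topic z_{d,n}.
    \begin{align*}
      \theta_d &\sim \mathrm{Dirichlet}(\alpha)                  && \text{topic mixture of scene } d \\
      z_{d,n} \mid \theta_d &\sim \mathrm{Multinomial}(\theta_d) && \text{topic assignment of visual word } n \\
      w_{d,n} \mid z_{d,n}, \phi &\sim \mathrm{Multinomial}(\phi_{z_{d,n}}), \quad \phi_k \sim \mathrm{Dirichlet}(\beta)
    \end{align*}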
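
The "simple (but efficient) algorithm to incrementally generate association rules" is likewise only named; the sketch below is a minimal, hypothetical illustration of mining rules between co-occurring topics one scene at a time (the class name, thresholds, and counting scheme are assumptions, not the authors' implementation):

    # Hypothetical sketch: incrementally mine association rules between topics
    # that co-occur in the same scene (not the paper's actual algorithm).
    from collections import Counter
    from itertools import combinations

    class TopicRuleMiner:
        def __init__(self, min_support=0.05, min_confidence=0.6):
            self.min_support = min_support
            self.min_confidence = min_confidence
            self.n_scenes = 0
            self.topic_counts = Counter()   # how often each topic is active
            self.pair_counts = Counter()    # how often each pair of topics co-occurs

        def update(self, scene_topics):
            """Incrementally absorb the set of topics active in one scene."""
            self.n_scenes += 1
            topics = set(scene_topics)
            self.topic_counts.update(topics)
            self.pair_counts.update(combinations(sorted(topics), 2))

        def rules(self):
            """Yield rules (head, tail, support, confidence) passing both thresholds."""
            for (a, b), joint in self.pair_counts.items():
                support = joint / self.n_scenes
                if support < self.min_support:
                    continue
                for head, tail in ((a, b), (b, a)):
                    confidence = joint / self.topic_counts[head]
                    if confidence >= self.min_confidence:
                        yield head, tail, support, confidence

    # Usage: each scene is summarized by the indices of its dominant topics.
    miner = TopicRuleMiner()
    for scene in [{0, 2}, {0, 2, 5}, {1, 3}, {0, 2}]:
        miner.update(scene)
    print(list(miner.rules()))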
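
Finally, the idea of bridging domain-dependent activities through domain-independent ones can be approximated, in a deliberately simplified way, by training a classifier only on topics assumed to be stable across domains; the scikit-learn sketch below is that simplification (the function name, inputs, and logistic-regression stand-in are all assumptions), not the clustering-based transfer method the abstract describes:

    # Hypothetical simplification: project both domains onto shared ("domain-
    # independent") topics and reuse the labeled source domain to classify the
    # unlabeled target domain.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def shared_topic_transfer(src_theta, src_labels, tgt_theta, shared_topics):
        """src_theta, tgt_theta: (scenes x topics) topic-proportion matrices;
        shared_topics: indices of topics assumed to be domain-independent."""
        clf = LogisticRegression(max_iter=1000)
        clf.fit(src_theta[:, shared_topics], src_labels)   # learn "dangerous scene?" on the source
        return clf.predict(tgt_theta[:, shared_topics])    # predict on the unlabeled target

    # Tiny synthetic usage example with random topic proportions.
    rng = np.random.default_rng(0)
    src = rng.dirichlet(np.ones(6), size=40)
    tgt = rng.dirichlet(np.ones(6), size=10)
    labels = (src[:, 0] > np.median(src[:, 0])).astype(int)   # stand-in for "dangerous scene" labels
    print(shared_topic_transfer(src, labels, tgt, shared_topics=[0, 1, 2]))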