A fully online and unsupervised system for large and high-density area surveillance: Tracking, semantic scene learning and abnormality detection

  • Authors:
  • Xuan Song; Xiaowei Shao; Quanshi Zhang; Ryosuke Shibasaki; Huijing Zhao; Jinshi Cui; Hongbin Zha

  • Affiliations:
  • The University of Tokyo, Japan; The University of Tokyo, Japan; The University of Tokyo, Japan; The University of Tokyo, Japan; Peking University, China; Peking University, China; Peking University, China

  • Venue:
  • ACM Transactions on Intelligent Systems and Technology (TIST) - Special section on agent communication, trust in multiagent systems, intelligent tutoring and coaching systems
  • Year:
  • 2013

Abstract

For reasons of public security, an intelligent surveillance system that can cover a large, crowded public area has become an urgent need. In this article, we propose a novel laser-based system that simultaneously performs tracking, semantic scene learning, and abnormality detection in a fully online and unsupervised way. Furthermore, these three tasks cooperate within one framework to improve one another's performance. The proposed system has the following key advantages over previous ones: (1) It covers a large area (more than 60 × 35 m) and simultaneously performs robust tracking, semantic scene learning, and abnormality detection in high-density situations. (2) The overall system adapts over time: it incrementally learns the structure of the scene and performs fully online abnormal activity detection and tracking, which makes it suitable for real-time applications. (3) The surveillance tasks are carried out in a fully unsupervised manner, so there is no need for manual labeling or the construction of huge training datasets. We successfully apply the proposed system to the JR subway station in Tokyo, and demonstrate that it can cover an area of 60 × 35 m, robustly track more than 150 targets at the same time, and simultaneously perform online semantic scene learning and abnormality detection with no human intervention.