A multimodal temporal panorama approach for moving vehicle detection, reconstruction and classification

Authors:
Tao Wang;Zhigang Zhu;Clark N. Taylor
Affiliations:
Department of Computer Science, Graduate Center, The City University of New York, United States;Department of Computer Science, City College, Graduate Center, The City University of New York, United States;Air Force Research Laboratory, 2241 Avionics Circle, WPAFB, OH, United States
Venue:
Computer Vision and Image Understanding
Year:
2013

Abstract

Moving vehicle detection and classification using multimodal data poses challenges in data collection, audio-visual alignment, data labeling and feature selection under uncontrolled environments with occlusions, motion blur, varying image resolutions and perspective distortions. In this work, we propose an effective multimodal temporal panorama (MTP) approach for moving vehicle detection and classification using a novel long-range audio-visual sensing system. A new audio-visual vehicle (AVV) dataset is created, which features automatic vehicle detection and audio-visual alignment, accurate vehicle extraction and reconstruction, and efficient data labeling. In particular, the visual image of each vehicle is reconstructed once the vehicle is detected, in order to remove most of the occlusions, motion blur, and variations in perspective view. Multimodal audio-visual features are extracted, including global geometric features (aspect ratios, profiles), local structure features (HOGs), as well as various audio features (MFCCs, etc.). Using SVMs with radial basis function (RBF) kernels, the effectiveness of integrating these multimodal features is thoroughly and systematically studied. The concept of MTP need not be limited to visual, motion and audio modalities; it could also be applicable to other sensing modalities that can obtain data in the temporal domain.
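
The abstract names three feature groups (geometric, HOG, audio) and an RBF-kernel SVM, but does not say how the groups are combined. As an illustration only, the sketch below assumes simple feature-level fusion: each modality's vector is L2-normalized and the results are concatenated, after which the RBF kernel that such an SVM would evaluate is computed between two vehicles. The function names (`l2_normalize`, `fuse`, `rbf_kernel`), the `gamma` value, and the toy feature values are all hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so no modality dominates by magnitude."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(*modalities: Sequence[float]) -> List[float]:
    """Feature-level fusion (an assumption): normalize each modality, then concatenate."""
    fused: List[float] = []
    for vec in modalities:
        fused.extend(l2_normalize(vec))
    return fused


def rbf_kernel(a: Sequence[float], b: Sequence[float], gamma: float = 0.5) -> float:
    """RBF kernel K(a, b) = exp(-gamma * ||a - b||^2)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)


# Hypothetical toy features for two vehicles (not real data).
geom_a, hog_a, mfcc_a = [2.1, 0.6], [0.2, 0.5, 0.3], [12.0, -3.0, 1.5]
geom_b, hog_b, mfcc_b = [2.0, 0.7], [0.1, 0.6, 0.3], [11.0, -2.5, 1.0]

vehicle_a = fuse(geom_a, hog_a, mfcc_a)
vehicle_b = fuse(geom_b, hog_b, mfcc_b)
similarity = rbf_kernel(vehicle_a, vehicle_b)
```

Normalizing per modality before concatenation is one common way to keep a large-valued group such as MFCCs from swamping the others inside the kernel's distance term; other fusion schemes would be equally plausible here.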