Vision-based action recognition of earthmoving equipment using spatio-temporal features and support vector machine classifiers

Authors:
Mani Golparvar-Fard; Arsalan Heydarian; Juan Carlos Niebles
Affiliations:
Department of Civil and Environmental Engineering, Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, United States; Charles E. Via Department of Civil & Environmental Engineering, Virginia Tech, Blacksburg, VA 24061, United States; Department of Electrical and Electronic Engineering, Universidad del Norte, Barranquilla, Colombia
Venue:
Advanced Engineering Informatics
Year:
2013

Abstract

Video recordings of earthmoving construction operations provide understandable data that can be used for benchmarking and analyzing their performance. These recordings also help project managers take corrective action on performance deviations and, in turn, improve operational efficiency. Despite these benefits, manual stopwatch studies of previously recorded videos can be labor-intensive, may suffer from observer bias, and become impractical after a substantial period of observation. This paper presents a new computer-vision-based algorithm for recognizing single actions of earthmoving construction equipment. This is a particularly challenging task because equipment can be partially occluded in site video streams and usually comes in a wide variety of sizes and appearances. The scale and pose of the equipment actions can also vary significantly with the camera configuration. In the proposed method, a video is first represented as a collection of spatio-temporal visual features by extracting space-time interest points and describing each feature with a Histogram of Oriented Gradients (HOG). The algorithm automatically learns the distributions of the spatio-temporal features and action categories using a multi-class Support Vector Machine (SVM) classifier. This strategy handles noisy feature points arising from typical dynamic backgrounds. Given a video sequence captured from a fixed camera, the multi-class SVM classifier recognizes and localizes equipment actions. For evaluation, a new video dataset is introduced that contains 859 sequences of excavator and truck actions. This dataset contains large variations in equipment pose and scale, and has varied backgrounds and levels of occlusion. The experimental results, with average accuracies of 86.33% and 98.33%, show that our supervised method outperforms previous algorithms for excavator and truck action recognition. The results show promise for applying the proposed method to construction activity analysis.
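
To make the feature-description step more concrete, the following is a minimal sketch of a gradient-orientation histogram of the kind that underlies HOG. It is not the authors' implementation. It assumes a single 2D grayscale patch, whereas the paper describes features around space-time interest points. It also omits the cell and block structure of full HOG. The function name `hog_descriptor`, the parameter `n_bins`, and the toy patch are hypothetical and chosen only for illustration.

```python
import math


def hog_descriptor(patch, n_bins=8):
    """Unsigned orientation histogram of image gradients for one 2D patch.

    patch: list of equal-length rows of grayscale intensities.
    Returns an L2-normalized list of n_bins magnitude-weighted votes.
    (Illustrative assumption: one histogram per patch, no cells or blocks.)
    """
    rows, cols = len(patch), len(patch[0])
    hist = [0.0] * n_bins
    bin_width = math.pi / n_bins  # unsigned orientations span [0, pi)
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            # Central differences approximate the spatial gradient.
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue  # flat pixels cast no vote
            # Fold the angle into [0, pi) so opposite directions share a bin.
            angle = math.atan2(gy, gx) % math.pi
            index = min(int(angle / bin_width), n_bins - 1)
            hist[index] += magnitude
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


# Toy patch with a vertical edge: intensity changes only along x.
vertical_edge = [[0, 0, 10, 10] for _ in range(4)]
descriptor = hog_descriptor(vertical_edge)
```

In a pipeline like the one the abstract outlines, descriptors of this kind would be computed at each detected interest point and passed on to the classifier. How the descriptors are aggregated per video is not stated in the abstract, so it is left out of this sketch.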