Tracking mean shift clustered point clouds for 3D surveillance
Proceedings of the 4th ACM international workshop on Video surveillance and sensor networks
Mutual Information-Based 3D Object Tracking
International Journal of Computer Vision
Locating nose-tips and estimating head poses in images by tensorposes
IEEE Transactions on Circuits and Systems for Video Technology
Real-time 3D face tracking with mutual information and active contours
ISVC'07 Proceedings of the 3rd international conference on Advances in visual computing - Volume Part I
3D facial pose tracking in uncalibrated videos
PReMI'05 Proceedings of the First international conference on Pattern Recognition and Machine Intelligence
Particle filtering is a popular technique for sequential state estimation problems. However, its convergence depends strongly on the balance between the number of particles (hypotheses) and the fitness of the dynamic model. In particular, when the dynamics are complex or poorly modeled, real applications usually require thousands of particles. This paper presents a hybrid sampling solution that combines sampling in the image feature space via RANSAC with sampling in the state space via particle filtering. We show that the number of particles can be reduced to dozens for a full 3D tracking problem that contains considerable noise of different types. For unexpected motions, a specific set of dynamics may not exist; our algorithm avoids the need for one. A theoretical convergence proof [1, 3] for particle filtering is difficult to obtain once RANSAC is integrated, so we instead address this problem by analyzing the likelihood distribution of particles from a real tracking example. With RANSAC, sampling is much more efficient because samples concentrate on the more likely areas. We also discuss measuring tracking quality in the sense of entropy or statistical testing. The algorithm has been applied to 3D face pose tracking under changing moderate or intense expressions. We demonstrate the validity of our approach with several video sequences acquired in an unstructured environment.
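To make the hybrid sampling idea concrete, the following is a minimal sketch under stated assumptions, not the method of the paper. It replaces 3D face pose with a toy 2D translation estimated from point matches that include outliers. Part of the particle set comes from a random-walk step in the state space, and the rest comes from RANSAC-style minimal samples in the feature space (one match gives one translation hypothesis). All names (`likelihood`, `hybrid_step`, `weight_entropy`, `run_demo`) and all parameter values are hypothetical. The weights use the likelihood alone and omit the proposal-density correction that a rigorous importance sampler would need, and the entropy of the normalized weights is only one possible reading of an entropy-based quality measure.

```python
import math
import random


def likelihood(state, matches, sigma=1.0):
    """Robust score of a 2D translation: sum of Gaussian terms over all matches."""
    tx, ty = state
    score = 1e-12  # small floor so the weights never sum to zero
    for (x0, y0), (x1, y1) in matches:
        r2 = (x1 - x0 - tx) ** 2 + (y1 - y0 - ty) ** 2
        score += math.exp(-r2 / (2.0 * sigma ** 2))
    return score


def hybrid_step(particles, matches, rng, n_ransac=20,
                motion_std=0.5, jitter_std=0.2):
    """One step mixing state-space and feature-space proposals (toy version)."""
    n = len(particles)
    n_ransac = min(n_ransac, n)
    # State-space sampling: a weak random-walk model, no specific dynamics.
    proposals = [
        (x + rng.gauss(0.0, motion_std), y + rng.gauss(0.0, motion_std))
        for x, y in particles[: n - n_ransac]
    ]
    # Feature-space sampling: each minimal sample (one match) yields a hypothesis.
    for _ in range(n_ransac):
        (x0, y0), (x1, y1) = rng.choice(matches)
        proposals.append((x1 - x0 + rng.gauss(0.0, jitter_std),
                          y1 - y0 + rng.gauss(0.0, jitter_std)))
    # Simplification: weights are normalized likelihoods only.
    raw = [likelihood(p, matches) for p in proposals]
    total = sum(raw)
    weights = [w / total for w in raw]
    estimate = (sum(w * p[0] for w, p in zip(weights, proposals)),
                sum(w * p[1] for w, p in zip(weights, proposals)))
    resampled = rng.choices(proposals, weights=weights, k=n)
    return resampled, estimate, weights


def weight_entropy(weights):
    """Shannon entropy of normalized weights; at most log(len(weights))."""
    return -sum(w * math.log(w) for w in weights if w > 0.0)


def run_demo(seed=0, n_particles=30, steps=3):
    """Synthetic static scene: 40 inlier matches, 20 outliers, shift (5, -3)."""
    rng = random.Random(seed)
    true_shift = (5.0, -3.0)
    matches = []
    for _ in range(40):  # inliers with small measurement noise
        x0, y0 = rng.uniform(0, 10), rng.uniform(0, 10)
        matches.append(((x0, y0),
                        (x0 + true_shift[0] + rng.gauss(0, 0.1),
                         y0 + true_shift[1] + rng.gauss(0, 0.1))))
    for _ in range(20):  # outliers with unrelated targets
        matches.append(((rng.uniform(0, 10), rng.uniform(0, 10)),
                        (rng.uniform(-20, 20), rng.uniform(-20, 20))))
    particles = [(0.0, 0.0)] * n_particles  # deliberately poor initialization
    for _ in range(steps):
        particles, estimate, weights = hybrid_step(particles, matches, rng)
    return estimate, weight_entropy(weights), n_particles
```

Calling `run_demo()` starts 30 particles far from the true shift and still recovers an estimate close to (5, -3), because the feature-space hypotheses place samples directly in high-likelihood regions. In this toy setting, a low weight entropy relative to `log(n_particles)` would indicate that only a few particles explain the data.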