Turning pervasive computing into mediated spaces. IBM Systems Journal.
Optical Flow Constraints on Deformable Models with Applications to Face Tracking. International Journal of Computer Vision.
Extraction of Visual Features for Lipreading. IEEE Transactions on Pattern Analysis and Machine Intelligence.
An Approach to Robust and Fast Locating Lip Motion. ICMI '00 Proceedings of the Third International Conference on Advances in Multimodal Interfaces.
Effective Tracking through Tree-Search. IEEE Transactions on Pattern Analysis and Machine Intelligence.
A Probabilistic Dynamic Contour Model for Accurate and Robust Lip Tracking. ICMI '02 Proceedings of the 4th IEEE International Conference on Multimodal Interfaces.
Statistical lip-appearance models trained automatically using audio information. EURASIP Journal on Applied Signal Processing.
Spoken Word Recognition from Side of Face Using Infrared Lip Movement Sensor. PIT '08 Proceedings of the 4th IEEE Tutorial and Research Workshop on Perception and Interactive Technologies for Speech-Based Systems: Perception in Multimodal Dialogue Systems.
Audiovisual-to-articulatory inversion. Speech Communication.
Refining face tracking with integral projections. AVBPA '03 Proceedings of the 4th International Conference on Audio- and Video-Based Biometric Person Authentication.
Lip contour segmentation using kernel methods and level sets. ISVC '07 Proceedings of the 3rd International Conference on Advances in Visual Computing, Part II.
Robust lip contour extraction using separability of multi-dimensional distributions. FGR '04 Proceedings of the Sixth IEEE International Conference on Automatic Face and Gesture Recognition.
Attractor-Guided Particle Filtering for Lip Contour Tracking. ACCV '06 Proceedings of the 7th Asian Conference on Computer Vision, Part I.
Lip Localization Based on Active Shape Model and Gaussian Mixture Model. PSIVT '06 Proceedings of the First Pacific Rim Conference on Advances in Image and Video Technology.
Human speech is inherently multi-modal, consisting of both audio and visual components. Researchers have recently shown that incorporating information about the position of the lips into acoustic speech recognisers enables robust recognition of noisy speech. In the case of Hidden Markov Model recognition, we show that this happens because the visual signal stabilises the alignment of states. We also show that unadorned lips, both the inner and outer contours, can be robustly tracked in real time on general-purpose workstations. To accomplish this, efficient algorithms are employed that contain three key components: shape models, motion models, and focused colour feature detectors, all of which are learnt from examples.
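The abstract does not specify how the colour feature detectors are learnt, so as an illustration only, one common approach is to fit class-conditional Gaussian colour models to labelled lip and skin pixels and score new pixels by their log-likelihood ratio. The function names and the synthetic colour values below are hypothetical, not taken from the paper:

```python
import numpy as np

def fit_gaussian(samples):
    """Fit a multivariate Gaussian (mean, covariance) to colour samples (N x 3).

    A small ridge term keeps the covariance invertible for tight colour clusters.
    """
    mu = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False) + 1e-6 * np.eye(samples.shape[1])
    return mu, cov

def log_likelihood(x, mu, cov):
    """Log-density of colour vectors x (M x 3) under the Gaussian (mu, cov)."""
    d = x - mu
    inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum('ij,jk,ik->i', d, inv, d)  # per-row Mahalanobis distance
    return -0.5 * (maha + logdet + x.shape[1] * np.log(2 * np.pi))

def lip_score(pixels, lip_model, skin_model):
    """Log-likelihood ratio: positive where lip colour is more probable than skin."""
    return log_likelihood(pixels, *lip_model) - log_likelihood(pixels, *skin_model)

# Hypothetical training data: RGB samples clustered around a reddish lip colour
# and a lighter skin colour (values chosen for illustration only).
rng = np.random.default_rng(0)
lip_pixels = rng.normal([150.0, 60.0, 70.0], 10.0, size=(500, 3))
skin_pixels = rng.normal([200.0, 160.0, 140.0], 10.0, size=(500, 3))
lip_model = fit_gaussian(lip_pixels)
skin_model = fit_gaussian(skin_pixels)

# Score two query pixels: one at each class centre.
scores = lip_score(np.array([[150.0, 60.0, 70.0],
                             [200.0, 160.0, 140.0]]), lip_model, skin_model)
```

In a tracker along the lines the abstract describes, such a score would be evaluated only along search lines normal to the current contour estimate ("focused" detection), rather than over the whole image.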