Turning pervasive computing into mediated spaces. IBM Systems Journal.
Optical Flow Constraints on Deformable Models with Applications to Face Tracking. International Journal of Computer Vision.
Extraction of Visual Features for Lipreading. IEEE Transactions on Pattern Analysis and Machine Intelligence.
An Approach to Robust and Fast Locating Lip Motion. ICMI '00 Proceedings of the Third International Conference on Advances in Multimodal Interfaces.
Effective Tracking through Tree-Search. IEEE Transactions on Pattern Analysis and Machine Intelligence.
A Probabilistic Dynamic Contour Model for Accurate and Robust Lip Tracking. ICMI '02 Proceedings of the 4th IEEE International Conference on Multimodal Interfaces.
Statistical lip-appearance models trained automatically using audio information. EURASIP Journal on Applied Signal Processing.
Spoken Word Recognition from Side of Face Using Infrared Lip Movement Sensor. PIT '08 Proceedings of the 4th IEEE Tutorial and Research Workshop on Perception and Interactive Technologies for Speech-Based Systems: Perception in Multimodal Dialogue Systems.
Audiovisual-to-articulatory inversion. Speech Communication.
Refining face tracking with integral projections. AVBPA '03 Proceedings of the 4th International Conference on Audio- and Video-Based Biometric Person Authentication.
Lip contour segmentation using kernel methods and level sets. ISVC '07 Proceedings of the 3rd International Conference on Advances in Visual Computing, Part II.
Robust lip contour extraction using separability of multi-dimensional distributions. FGR '04 Proceedings of the Sixth IEEE International Conference on Automatic Face and Gesture Recognition.
Attractor-Guided Particle Filtering for Lip Contour Tracking. ACCV '06 Proceedings of the 7th Asian Conference on Computer Vision, Part I.
Lip Localization Based on Active Shape Model and Gaussian Mixture Model. PSIVT '06 Proceedings of the First Pacific Rim Conference on Advances in Image and Video Technology.
Human speech is inherently multi-modal, consisting of both audio and visual components. Researchers have recently shown that incorporating information about the position of the lips into acoustic speech recognisers enables robust recognition of noisy speech. In the case of Hidden Markov Model recognition, we show that this happens because the visual signal stabilises the alignment of states. We also show that unadorned lips, both the inner and outer contours, can be robustly tracked in real time on general-purpose workstations. To accomplish this, efficient algorithms are employed that contain three key components: shape models, motion models, and focused colour feature detectors, all of which are learnt from examples.
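The abstract does not specify how the colour feature detectors are learnt, so as an illustration only, one common approach is to fit class-conditional Gaussian colour models to labelled lip and skin pixels and score new pixels by their log-likelihood ratio. The function names and the synthetic colour values below are hypothetical, not taken from the paper:

```python
import numpy as np

def fit_gaussian(samples):
    """Fit a multivariate Gaussian (mean, covariance) to colour samples (N x 3).

    A small ridge term keeps the covariance invertible for tight colour clusters.
    """
    mu = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False) + 1e-6 * np.eye(samples.shape[1])
    return mu, cov

def log_likelihood(x, mu, cov):
    """Log-density of colour vectors x (M x 3) under the Gaussian (mu, cov)."""
    d = x - mu
    inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum('ij,jk,ik->i', d, inv, d)  # per-row Mahalanobis distance
    return -0.5 * (maha + logdet + x.shape[1] * np.log(2 * np.pi))

def lip_score(pixels, lip_model, skin_model):
    """Log-likelihood ratio: positive where lip colour is more probable than skin."""
    return log_likelihood(pixels, *lip_model) - log_likelihood(pixels, *skin_model)

# Hypothetical training data: RGB samples clustered around a reddish lip colour
# and a lighter skin colour (values chosen for illustration only).
rng = np.random.default_rng(0)
lip_pixels = rng.normal([150.0, 60.0, 70.0], 10.0, size=(500, 3))
skin_pixels = rng.normal([200.0, 160.0, 140.0], 10.0, size=(500, 3))
lip_model = fit_gaussian(lip_pixels)
skin_model = fit_gaussian(skin_pixels)

# Score two query pixels: one at each class centre.
scores = lip_score(np.array([[150.0, 60.0, 70.0],
                             [200.0, 160.0, 140.0]]), lip_model, skin_model)
```

In a tracker along the lines the abstract describes, such a score would be evaluated only along search lines normal to the current contour estimate ("focused" detection), rather than over the whole image.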