Multimodal human behavior analysis is challenging due to the complex nonlinear correlations and interactions across modalities. We present a novel approach to this problem based on Kernel Canonical Correlation Analysis (KCCA) and Multi-view Hidden Conditional Random Fields (MV-HCRF). Our approach uses a nonlinear kernel to map multimodal data into a high-dimensional feature space and finds a projection of the data that maximizes the correlation across modalities. We then use a multi-chain structured graphical model with disjoint sets of latent variables, one set per modality, to jointly learn both view-shared and view-specific sub-structures of the projected data, capturing interactions across modalities explicitly. We evaluate our approach on the task of recognizing agreement and disagreement from nonverbal audio-visual cues using the Canal 9 dataset. Experimental results show that KCCA facilitates capturing nonlinear hidden dynamics and that MV-HCRF helps learn interactions across modalities.
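To make the first stage concrete, the following is a minimal sketch of regularized KCCA, not the authors' implementation: each view is mapped through a kernel (an RBF kernel is assumed here for illustration), the kernel matrices are centered, and the projection directions that maximize cross-view correlation are found by solving a generalized eigenproblem. The `gamma` and `reg` parameters are illustrative assumptions, not values from the paper.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gram matrix of the RBF kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq = np.sum(X ** 2, axis=1)
    dists = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * dists)


def center_kernel(K):
    """Center the Gram matrix in feature space: H K H with H = I - 11^T/n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H


def kcca(Kx, Ky, reg=1e-3):
    """Regularized kernel CCA on centered Gram matrices Kx, Ky.

    Solves (Kx + r I)^{-1} Ky (Ky + r I)^{-1} Kx a = rho^2 a, so the
    eigenvalues are squared canonical correlations and the eigenvectors
    are dual coefficients for the first view.
    """
    n = Kx.shape[0]
    r = reg * n
    Rx = np.linalg.solve(Kx + r * np.eye(n), Ky)   # (Kx + rI)^{-1} Ky
    Ry = np.linalg.solve(Ky + r * np.eye(n), Kx)   # (Ky + rI)^{-1} Kx
    vals, vecs = np.linalg.eig(Rx @ Ry)
    order = np.argsort(-vals.real)
    corrs = np.sqrt(np.clip(vals.real[order], 0.0, None))
    return corrs, vecs.real[:, order]


if __name__ == "__main__":
    # Two synthetic views driven by a shared latent variable t:
    # nonlinearly related, so linear CCA would miss the dependence.
    rng = np.random.default_rng(0)
    t = rng.uniform(-2.0, 2.0, size=(80, 1))
    X = np.hstack([t, t ** 2]) + 0.05 * rng.normal(size=(80, 2))
    Y = np.hstack([np.sin(t), np.cos(t)]) + 0.05 * rng.normal(size=(80, 2))

    Kx = center_kernel(rbf_kernel(X, gamma=0.5))
    Ky = center_kernel(rbf_kernel(Y, gamma=0.5))
    corrs, alphas = kcca(Kx, Ky, reg=1e-2)
    print("top canonical correlation:", corrs[0])
```

In the full pipeline described above, the projected representations of the audio and visual streams would then feed the multi-chain MV-HCRF rather than being used directly; the regularizer `reg` is what keeps the kernel correlations from degenerating to 1 on training data.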