Inferring Human Interactions in Meetings: A Multimodal Approach

  • Authors:
  • Zhiwen Yu; Zhiyong Yu; Yusa Ko; Xingshe Zhou; Yuichi Nakamura

  • Affiliations:
  • School of Computer Science, Northwestern Polytechnical University, P.R. China (Zhiwen Yu, Zhiyong Yu, Xingshe Zhou); Academic Center for Computing and Media Studies, Kyoto University, Japan (Yusa Ko, Yuichi Nakamura)

  • Venue:
  • UIC '09 Proceedings of the 6th International Conference on Ubiquitous Intelligence and Computing
  • Year:
  • 2009

Abstract

Social dynamics, such as human interactions, are important for understanding how a conclusion was reached in a meeting and for determining whether the meeting was well organized. In this paper, a multimodal approach is proposed for inferring human semantic interactions in meeting discussions. A human interaction, such as proposing an idea, giving a comment, or expressing a positive opinion, implies a user's role, attitude, or intention toward a topic. Our approach infers human interactions from a variety of audiovisual and high-level features, e.g., gestures, attention, speech tone, speaking time, interaction occasion, and information about the previous interaction. Four inference models, Support Vector Machine (SVM), Bayesian network, Naïve Bayes, and Decision Tree, are selected and compared for human interaction recognition. Our experimental results show that SVM outperforms the other inference models, that human interactions can be inferred with a recognition rate of around 80%, and that the multimodal approach achieves robust and reliable results by leveraging the characteristics of each individual modality.
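To make the pipeline described in the abstract concrete, the sketch below trains and cross-validates three of the four compared models on per-utterance feature vectors. It is a minimal illustration only: the feature encodings, class labels, and synthetic data are assumptions for demonstration, not the authors' actual features or dataset, and the Bayesian network model is omitted because scikit-learn has no built-in implementation of it.

```python
# Hypothetical sketch of the feature-based interaction classification
# described in the abstract (not the authors' implementation).
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 400  # number of utterances (synthetic)

# One feature vector per utterance, loosely following the paper's feature
# list; in the real system these would be extracted from audio and video.
X = np.column_stack([
    rng.integers(0, 4, n),       # gesture category
    rng.integers(0, 5, n),       # attention target (who is looked at)
    rng.normal(200.0, 40.0, n),  # speech tone (e.g., mean pitch in Hz)
    rng.exponential(5.0, n),     # speaking time in seconds
    rng.integers(0, 3, n),       # interaction occasion
    rng.integers(0, 7, n),       # type of the previous interaction
])
# Interaction classes from the abstract, e.g. 0 = propose an idea,
# 1 = give a comment, 2 = positive opinion, 3 = other.
y = rng.integers(0, 4, n)

models = {
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0)),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(max_depth=5),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.2f}")
```

Because the labels above are random, all models score near chance (~0.25); with real multimodal features the comparison would reflect the roughly 80% SVM recognition rate the paper reports.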