Multimodal model integration for sentence unit detection

  • Authors:
  • Mary P. Harper; Elizabeth Shriberg

  • Affiliations:
  • Purdue University, West Lafayette, IN; SRI International, Menlo Park, CA

  • Venue:
  • Proceedings of the 6th International Conference on Multimodal Interfaces (ICMI '04)
  • Year:
  • 2004

Abstract

In this paper, we adopt a direct modeling approach to using conversational gesture cues for detecting the boundaries of sentence units (SUs) in videotaped conversations. We treat SU detection as a classification task: for each inter-word boundary, the classifier decides whether or not an SU boundary is present. In addition to gesture cues, we utilize prosodic and lexical knowledge sources. In this first investigation, we find that gesture features complement the prosodic and lexical knowledge sources for this task. By using all of the knowledge sources together, the model achieves the lowest overall SU detection error rate.
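To make the task framing concrete, the sketch below casts SU detection as binary classification over inter-word boundaries, with each boundary described by a combined feature vector drawn from prosodic, lexical, and gesture cues. This is a minimal illustration only, not the authors' model: the feature names, the synthetic data, and the choice of logistic regression are all assumptions made for the example.

```python
# Illustrative sketch (not the paper's actual system): SU boundary detection
# as binary classification over inter-word boundaries, combining hypothetical
# prosodic, lexical, and gesture features into one feature vector per boundary.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
n_boundaries = 2000

# Hypothetical features per inter-word boundary:
#   prosodic: pause duration, pitch reset, phone lengthening (stand-ins)
#   lexical:  indicator that the preceding word is a common sentence-final token
#   gesture:  indicator of a gesture-stroke ending near the boundary, hold duration
prosody = rng.normal(size=(n_boundaries, 3))
lexical = rng.integers(0, 2, size=(n_boundaries, 1)).astype(float)
gesture = rng.normal(size=(n_boundaries, 2))

X = np.hstack([prosody, lexical, gesture])

# Synthetic labels standing in for hand-annotated SU boundaries: a noisy
# combination of the cues determines whether a boundary is an SU boundary.
logit = 1.5 * prosody[:, 0] + 1.0 * lexical[:, 0] + 0.8 * gesture[:, 0] - 1.0
y = (logit + rng.normal(scale=0.5, size=n_boundaries) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Any binary classifier fits this framing; logistic regression keeps it simple.
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test),
                            target_names=["no-SU", "SU"]))
```

Dropping the gesture columns from `X` and retraining gives a rough sense, within this toy setup, of how an added knowledge source can lower the boundary classification error, which mirrors the comparison the abstract describes.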