Salience modeling based on non-verbal modalities for spoken language understanding

  • Authors:
  • Shaolin Qu; Joyce Y. Chai

  • Affiliations:
  • Michigan State University, East Lansing, MI; Michigan State University, East Lansing, MI

  • Venue:
  • Proceedings of the 8th International Conference on Multimodal Interfaces
  • Year:
  • 2006

Abstract

Previous studies have shown that, in multimodal conversational systems, fusing information from multiple modalities can improve overall input interpretation through mutual disambiguation. Inspired by these findings, this paper investigates the use of non-verbal modalities, in particular deictic gesture, in spoken language processing. Our assumption is that, during multimodal conversation, a user's deictic gestures on the graphical display signal the part of the underlying domain model that is salient at that point of the interaction. This salient domain model can then be used to constrain hypotheses for spoken language processing. Based on this assumption, the paper examines different configurations of salience-driven language models (e.g., n-gram and probabilistic context-free grammar) applied at different stages of spoken language processing. Our empirical results demonstrate the potential of integrating salience models based on non-verbal modalities into spoken language understanding.
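
The abstract does not give implementation details, but the salience-driven n-gram idea can be illustrated with a minimal sketch: a base bigram model is linearly interpolated with a salience distribution that boosts words related to the object a deictic gesture just indicated, and the mixed model rescores an n-best list from the recognizer. All identifiers below (BASE_BIGRAMS, SALIENT_VOCAB, LAMBDA, etc.) and the specific interpolation scheme are hypothetical, not the authors' implementation.

```python
# Hedged sketch: gesture-driven salience as a bias on a bigram language model
# used for n-best rescoring. Probabilities and vocabularies are toy values.

import math

# Hypothetical base bigram probabilities P(w2 | w1), estimated elsewhere.
BASE_BIGRAMS = {
    ("show", "me"): 0.4, ("me", "this"): 0.2, ("me", "the"): 0.3,
    ("this", "house"): 0.5, ("the", "price"): 0.4, ("this", "horse"): 0.1,
}

# Words associated with the object the user just pointed at (e.g. a house
# icon on the display); this stands in for the salient part of the domain
# model signaled by the deictic gesture.
SALIENT_VOCAB = {"house", "price", "bedroom"}

LAMBDA = 0.7   # interpolation weight for the base model (assumed value)
FLOOR = 1e-4   # probability floor for unseen bigrams


def salience_prob(word: str) -> float:
    """Uniform boost over the salient vocabulary, tiny mass elsewhere."""
    return 1.0 / len(SALIENT_VOCAB) if word in SALIENT_VOCAB else FLOOR


def bigram_prob(prev: str, word: str) -> float:
    """Interpolate the base bigram model with the salience model."""
    base = BASE_BIGRAMS.get((prev, word), FLOOR)
    return LAMBDA * base + (1.0 - LAMBDA) * salience_prob(word)


def score(hypothesis: list[str]) -> float:
    """Log-probability of a recognizer hypothesis under the mixed model."""
    return sum(math.log(bigram_prob(p, w))
               for p, w in zip(hypothesis, hypothesis[1:]))


if __name__ == "__main__":
    # Two acoustically similar hypotheses; the gesture toward a house
    # makes the first one more plausible after rescoring.
    nbest = [["show", "me", "this", "house"],
             ["show", "me", "this", "horse"]]
    print("selected:", " ".join(max(nbest, key=score)))
```

In this toy setup the gesture-primed salience term lifts "house" over the acoustically similar "horse"; the paper explores analogous salience constraints for both n-gram and PCFG models at different processing stages.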