Graph-based partial hypothesis fusion for pen-aided speech input

  • Authors:
  • Peng Liu; Frank K. Soong

  • Affiliations:
  • Microsoft Research Asia, Beijing, China; Microsoft Research Asia, Beijing, China

  • Venue:
  • IEEE Transactions on Audio, Speech, and Language Processing - Special issue on multimodal processing in speech-based interactions
  • Year:
  • 2009

Abstract

We study a specific partial hypothesis fusion problem in sequential data labeling. The problem arises in multimodal applications where a decision is made by merging complete hypotheses from one input with partial hypotheses from the other. For example, in a pen-aided speech interface, appropriate pen input can provide partial but crucial information. We address the problem in a Bayesian framework and reformulate the solution as a revised search in a graph representation. A dynamic programming algorithm is proposed to solve the partial hypothesis fusion efficiently on the graph. The computational cost of the graph-based partial hypothesis fusion is shown to be proportional to the size of the graph, which makes it highly feasible for a compact graph. We apply the proposed algorithm to two real applications: an intelligent pen-based dictation error correction system and automatic handwritten character completion with a speech "shortcut". Experimental results show that the algorithm effectively uses the partial information from one modality to improve the performance of the bimodal interface.
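
The abstract gives no algorithmic detail, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes a toy word graph whose nodes are numbered in topological order, log-domain edge scores (so that adding scores corresponds to multiplying probabilities), and a hard pen constraint of the hypothetical form "at least one word on the path must match". The names `fuse_partial`, `edges`, and `matches` are invented for this example. The search is a Viterbi-style dynamic program over (node, constraint-satisfied) states, so each edge is relaxed a constant number of times.

```python
import math

def fuse_partial(edges, start, end, matches):
    """Best-scoring path through a word graph that satisfies a partial constraint.

    edges   : list of (src, dst, label, log_score); nodes are assumed to be
              numbered in topological order, i.e. src < dst for every edge.
    matches : hypothetical predicate label -> bool, True when the word agrees
              with the partial (e.g. pen) evidence.
    Returns (log_score, labels) for the best start-to-end path containing at
    least one matching edge, or (-inf, None) if no such path exists.
    """
    neg_inf = -math.inf
    # best[(node, satisfied)] = (score, backpointer); the flag records whether
    # a matching edge has already been used on the way to this node.
    best = {(start, False): (0.0, None)}

    # Sorting by source node guarantees that every edge entering a node is
    # processed before any edge leaving it.
    for src, dst, label, score in sorted(edges, key=lambda e: e[0]):
        for flag in (False, True):
            if (src, flag) not in best:
                continue
            new_flag = flag or matches(label)
            cand = best[(src, flag)][0] + score
            if cand > best.get((dst, new_flag), (neg_inf, None))[0]:
                best[(dst, new_flag)] = (cand, (src, flag, label))

    state = (end, True)
    if state not in best:
        return neg_inf, None

    total = best[state][0]
    labels = []
    # Follow backpointers from the end state to the start state.
    while best[state][1] is not None:
        src, flag, label = best[state][1]
        labels.append(label)
        state = (src, flag)
    labels.reverse()
    return total, labels


# Toy word graph with made-up log scores (illustrative values only).
edges = [
    (0, 1, "wreck", -0.4),
    (1, 2, "a", -0.1),
    (2, 3, "nice", -0.3),
    (0, 3, "recognize", -1.2),
    (3, 4, "speech", -0.5),
    (3, 4, "beach", -0.6),
]

# Hypothetical pen evidence: the user wrote the initial letter "r".
score, words = fuse_partial(edges, 0, 4, lambda w: w.startswith("r"))
print(words)
```

In this toy graph the unconstrained best path is "wreck a nice speech", while the single-letter constraint steers the search to "recognize speech". A soft constraint could instead be modeled by adding a per-word evidence score to each edge score; that variant is also an assumption and is not taken from the paper.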