Fusion and fission: improved MMIA for multi-modal HCI based on WPS and voice-XML

  • Authors:
  • Jung-Hyun Kim; Kwang-Seok Hong

  • Affiliations:
  • School of Information and Communication Engineering, Sungkyunkwan University, Suwon, KyungKi-do, Korea (both authors)

  • Venue:
  • ICOST'07 Proceedings of the 5th international conference on Smart homes and health telematics
  • Year:
  • 2007


Abstract

This paper implements a Multi-Modal Instruction Agent (hereinafter, MMIA) with synchronization between the audio and gesture modalities, and proposes improved fusion and fission rules that depend on the SNNR (Signal Plus Noise to Noise Ratio) and a fuzzy value, built on an embedded KSSL (Korean Standard Sign Language) recognizer using the WPS (Wearable Personal Station) and Voice-XML. Our approach fuses and recognizes sentence- and word-based instruction models represented by speech and KSSL, then fissions the recognition result according to a weight decision rule and translates it into synthetic speech and a visual illustration (graphical display on an HMD, Head-Mounted Display) in real time. To validate the approach, we evaluate the MMIA's average recognition rates and recognition time. In the experiments, the average recognition rates for the prescribed 65 sentential and 156 word instruction models were 94.33% and 96.85% in clean environments, and 92.29% and 92.91% in noisy environments. The average recognition time was approximately 0.36 ms in both environments.
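
The abstract describes an SNNR- and fuzzy-value-driven weight decision rule that fuses the speech and KSSL channels and fissions the result to TTS and HMD outputs, but it does not give the rule itself. The sketch below is a minimal illustration of one plausible such rule, not the authors' method: the function names, the 20 dB clean-speech reference, and the confidence normalization are all assumptions for illustration only.

```python
import math

def snnr_db(signal_plus_noise_power: float, noise_power: float) -> float:
    """Signal Plus Noise to Noise Ratio in dB: 10 * log10((S + N) / N)."""
    return 10.0 * math.log10(signal_plus_noise_power / noise_power)

def fusion_weights(snnr: float, fuzzy_value: float,
                   snnr_clean_db: float = 20.0) -> tuple[float, float]:
    """Hypothetical weight decision: trust the speech channel in clean
    audio and shift weight toward the KSSL (gesture) channel as noise
    rises. `fuzzy_value` in [0, 1] is the gesture recognizer's confidence;
    the 20 dB clean-speech reference is an assumed parameter."""
    audio_conf = min(max(snnr / snnr_clean_db, 0.0), 1.0)
    w_speech = audio_conf / (audio_conf + fuzzy_value)
    return w_speech, 1.0 - w_speech

def fission(result: str, w_speech: float, w_gesture: float) -> list[str]:
    """Route the fused recognition result to both output channels,
    ordering them by the dominant modality weight."""
    tts, hmd = f"TTS: {result}", f"HMD: {result}"
    return [tts, hmd] if w_speech >= w_gesture else [hmd, tts]

if __name__ == "__main__":
    # Noisy audio (about 6 dB SNNR) with a confident gesture reading:
    # the weight shifts toward the KSSL channel.
    w_s, w_g = fusion_weights(snnr_db(4.0, 1.0), fuzzy_value=0.8)
    print(fission("TURN ON THE LIGHT", w_s, w_g))
```

In this toy run the weight decision favors the gesture channel (roughly 0.73 vs. 0.27), so the HMD rendering is listed first; a real deployment would tune the threshold and normalization against measured noise conditions.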