A software framework for multimodal human-computer interaction systems

  • Authors:
  • Jie Shen; Maja Pantic

  • Affiliations:
  • Department of Computing, Imperial College London, London, U.K.; Department of Computing, Imperial College London, U.K., and EEMCS, University of Twente, The Netherlands

  • Venue:
  • SMC'09: Proceedings of the 2009 IEEE International Conference on Systems, Man and Cybernetics
  • Year:
  • 2009

Abstract

This paper describes a software framework we designed and implemented for development and research on multimodal human-computer interfaces. The proposed framework is based on a publish/subscribe architecture, which allows developers and researchers to conveniently configure, test and extend their systems in a modular and incremental manner. To achieve reliable and efficient data transport between modules while retaining a high degree of system flexibility, the framework uses a shared-memory-based data transport protocol for message delivery together with a TCP-based system management protocol that maintains the integrity of the system structure at runtime. The framework is delivered as communication middleware, providing a basic system manager and well-documented, easy-to-use, open-source C++ SDKs that support both module development and server extension. An experimental comparison between the proposed framework and other similar tools available to the community indicates that our framework greatly outperforms them in average message latency, maximum data throughput and CPU consumption, especially under heavy workloads. To demonstrate the performance of our framework in real-world applications, we built a demonstration system that detects faces and facial feature points in video captured in real time. The results show that our framework can deliver some tens of megabytes of data per second effectively and efficiently, even under tight resource constraints.
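
The paper's actual C++ SDK is not reproduced here; the following is only a minimal, in-process sketch of the publish/subscribe decoupling the framework is built on. All names (Broker, subscribe, publish, the "video.frames" topic) are hypothetical illustrations, and the real framework moves messages over shared memory between processes rather than through an in-process callback table as done below.

```cpp
// Minimal sketch of publish/subscribe decoupling (hypothetical API, not the paper's SDK).
#include <functional>
#include <iostream>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Messages are plain byte buffers; the real framework would carry them
// over a shared-memory transport, but here they are passed in-process.
using Message = std::vector<unsigned char>;
using Handler = std::function<void(const Message&)>;

class Broker {
public:
    // A module registers interest in a topic by supplying a callback.
    void subscribe(const std::string& topic, Handler handler) {
        subscribers_[topic].push_back(std::move(handler));
    }

    // A module publishes a message; the broker fans it out to every
    // subscriber of that topic. Publishers and subscribers never need
    // to know about each other, which is what keeps the system modular.
    void publish(const std::string& topic, const Message& msg) const {
        auto it = subscribers_.find(topic);
        if (it == subscribers_.end()) return;
        for (const auto& handler : it->second) handler(msg);
    }

private:
    std::map<std::string, std::vector<Handler>> subscribers_;
};

int main() {
    Broker broker;

    // A hypothetical face-detection module subscribes to raw frames.
    broker.subscribe("video.frames", [](const Message& frame) {
        std::cout << "face detector received a frame of "
                  << frame.size() << " bytes\n";
    });

    // A hypothetical capture module publishes a (fake) RGB frame.
    Message frame(640 * 480 * 3, 0);
    broker.publish("video.frames", frame);
    return 0;
}
```

In the framework described by the paper, this decoupling is split across two channels: bulk message delivery goes over shared memory, so large payloads such as video frames avoid per-message network copies, while a separate TCP-based management protocol handles module registration and keeps the system structure consistent at runtime.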