Modern mobile devices, such as smartphones and tablets, are becoming increasingly popular amongst users of all ages. Text entry is one of the most important modes of interaction between humans and their mobile devices. Although typing on a touchscreen display using a soft keyboard remains the most common text input method for many users, the process can be frustratingly slow, especially on smartphones with smaller screens. Voice input offers an attractive alternative that completely eliminates the need for typing. However, voice input relies on automatic speech recognition technology, whose performance degrades significantly in noisy environments or for non-native speakers. This paper presents Speak-As-You-Swipe (SAYS), a novel multimodal interface that enables efficient continuous text entry on mobile devices. SAYS integrates a gesture keyboard with speech recognition to improve the efficiency and accuracy of text entry. The swipe gesture and voice inputs provide complementary information that can be very effective in disambiguating confusable word predictions. The word prediction hypotheses from the gesture keyboard are incorporated directly into the speech recognition process so that the SAYS interface can handle continuous input. Experimental results show that, for a 20k vocabulary, the proposed SAYS interface achieves a prediction accuracy of 96.4% in clean conditions and about 94.0% in noisy environments, compared to 92.2% using a gesture keyboard alone.
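The abstract describes combining word prediction hypotheses from a gesture keyboard with speech recognition scores so that each modality disambiguates the other. The paper does not spell out the fusion rule here; below is a minimal sketch of one plausible scheme, a log-linear interpolation of per-word probabilities from the two modalities. The function name, the `weight` parameter, and the probability floor are all illustrative assumptions, not details taken from the paper.

```python
import math

def fuse_scores(asr_probs, gesture_probs, weight=0.5):
    """Combine per-word probabilities from a speech recognizer and a
    gesture keyboard by log-linear interpolation, then renormalize.

    `weight` (hypothetical knob) controls the gesture keyboard's
    influence: 0.0 uses only the ASR scores, 1.0 only the gesture scores.
    """
    words = set(asr_probs) | set(gesture_probs)
    floor = 1e-12  # avoid log(0) for words one modality never hypothesized
    fused = {}
    for w in words:
        log_p = ((1 - weight) * math.log(asr_probs.get(w, floor))
                 + weight * math.log(gesture_probs.get(w, floor)))
        fused[w] = math.exp(log_p)
    total = sum(fused.values())
    return {w: p / total for w, p in fused.items()}

# Example: the swipe trace confuses "great"/"grate", the audio confuses
# "great"/"crate"; fusing the two resolves the ambiguity toward "great".
asr = {"great": 0.5, "crate": 0.3, "grate": 0.2}
gesture = {"great": 0.6, "grate": 0.4}
fused = fuse_scores(asr, gesture)
```

The example illustrates the complementarity the abstract claims: a word that only one modality finds plausible ("crate" for the gesture trace) is heavily penalized, while a word both modalities support dominates the fused distribution.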