Modern mobile devices, such as smartphones and tablets, are becoming increasingly popular amongst users of all ages. Text entry is one of the most important modes of interaction between humans and their mobile devices. Although typing on a touchscreen display using a soft keyboard remains the most common text input method for many users, the process can be frustratingly slow, especially on smartphones with smaller screens. Voice input offers an attractive alternative that completely eliminates the need for typing. However, voice input relies on automatic speech recognition technology, whose performance degrades significantly in noisy environments or for non-native speakers. This paper presents Speak-As-You-Swipe (SAYS), a novel multimodal interface that enables efficient continuous text entry on mobile devices. SAYS integrates a gesture keyboard with speech recognition to improve the efficiency and accuracy of text entry. The swipe gesture and voice inputs provide complementary information that can be very effective in disambiguating confusable word predictions. The word prediction hypotheses from the gesture keyboard are incorporated directly into the speech recognition process so that the SAYS interface can handle continuous input. Experimental results show that, for a 20k vocabulary, the proposed SAYS interface achieves a prediction accuracy of 96.4% in clean conditions and about 94.0% in noisy environments, compared to 92.2% using a gesture keyboard alone.
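The abstract describes combining word prediction hypotheses from a gesture keyboard with speech recognition scores so that each modality disambiguates the other. The paper does not spell out the fusion rule here; below is a minimal sketch of one plausible scheme, a log-linear interpolation of per-word probabilities from the two modalities. The function name, the `weight` parameter, and the probability floor are all illustrative assumptions, not details taken from the paper.

```python
import math

def fuse_scores(asr_probs, gesture_probs, weight=0.5):
    """Combine per-word probabilities from a speech recognizer and a
    gesture keyboard by log-linear interpolation, then renormalize.

    `weight` (hypothetical knob) controls the gesture keyboard's
    influence: 0.0 uses only the ASR scores, 1.0 only the gesture scores.
    """
    words = set(asr_probs) | set(gesture_probs)
    floor = 1e-12  # avoid log(0) for words one modality never hypothesized
    fused = {}
    for w in words:
        log_p = ((1 - weight) * math.log(asr_probs.get(w, floor))
                 + weight * math.log(gesture_probs.get(w, floor)))
        fused[w] = math.exp(log_p)
    total = sum(fused.values())
    return {w: p / total for w, p in fused.items()}

# Example: the swipe trace confuses "great"/"grate", the audio confuses
# "great"/"crate"; fusing the two resolves the ambiguity toward "great".
asr = {"great": 0.5, "crate": 0.3, "grate": 0.2}
gesture = {"great": 0.6, "grate": 0.4}
fused = fuse_scores(asr, gesture)
```

The example illustrates the complementarity the abstract claims: a word that only one modality finds plausible ("crate" for the gesture trace) is heavily penalized, while a word both modalities support dominates the fused distribution.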