Word n-grams for cluster keyboards

  • Authors:
  • Nils Klarlund; Michael Riley

  • Affiliations:
  • AT&T Labs-Research, Florham Park, NJ; AT&T Labs-Research, Florham Park, NJ

  • Venue:
  • TextEntry '03 Proceedings of the 2003 EACL Workshop on Language Modeling for Text Entry Methods
  • Year:
  • 2003

Abstract

A cluster keyboard partitions the letters of the alphabet onto subset keys. On such keyboards most words are typed with no more key presses than on the standard keyboard, but a key sequence may stand for two or more words. In current practice, this ambiguity is resolved by hypothesizing words according to their unigram (occurrence) frequency. When the hypothesized word is not the intended one, an error arises. In this paper, we study the effect on the error rate of deploying large n-gram language models of the kind used in speech recognition. We use the North American Business News (NAB) corpus, which contains hundreds of millions of words. We report results for the telephone keypad and for cluster keyboards with 5, 8, 10, and 14 keys based on the QWERTY layout. Despite our assumption that a word hypothesis must be displayed promptly, we show that the error rate can be reduced to as little as one-fourth of that of the unigram method.
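To make the ambiguity concrete, here is a minimal sketch (not the authors' system) of the baseline unigram method on the telephone keypad: each key stands for a set of letters, several words can collapse onto the same key sequence, and the method always displays the most frequent candidate. The vocabulary and frequency counts below are toy values invented for illustration, standing in for corpus statistics such as those from NAB.

```python
# Illustrative sketch of unigram disambiguation on a telephone keypad.
# The toy vocabulary and counts are assumptions made up for this example.
from collections import defaultdict

# Standard telephone keypad clustering: each key covers a subset of letters.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {c: k for k, letters in KEYPAD.items() for c in letters}

def key_sequence(word: str) -> str:
    """Map a word to the key sequence that types it on the keypad."""
    return "".join(LETTER_TO_KEY[c] for c in word.lower())

# Toy unigram counts standing in for real corpus statistics.
UNIGRAM_COUNTS = {"good": 5000, "home": 4200, "gone": 3100, "hood": 800}

# Group the vocabulary by key sequence to expose the ambiguity.
candidates = defaultdict(list)
for w in UNIGRAM_COUNTS:
    candidates[key_sequence(w)].append(w)

def hypothesize(keys: str) -> str:
    """Unigram method: display the most frequent word for the key sequence."""
    words = candidates.get(keys, [])
    return max(words, key=UNIGRAM_COUNTS.get) if words else ""

# "good", "home", "gone", and "hood" all collapse onto the keys 4663;
# the unigram method always shows "good", so typing "gone" is an error.
print(candidates["4663"])     # ['good', 'home', 'gone', 'hood']
print(hypothesize("4663"))    # 'good'
```

An n-gram model instead conditions the choice on the preceding words, so a context such as "she has ___ home" can promote "gone" over "good"; quantifying that improvement over the unigram baseline is what the paper reports.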