SpeechActs: A Spoken-Language Framework

  • Authors:
  • Paul Martin, Frederick Crabbe, Stuart Adams, Eric Baatz, Nicole Yankelovich

  • Venue:
  • Computer
  • Year:
  • 1996

Abstract

SpeechActs is a prototype testbed for developing spoken natural language applications. It lets software developers without special expertise in speech or natural language create applications with which users can speak naturally, as if they were conversing with a personal assistant. We believe we have achieved a degree of conversational naturalness similar to that of the outstanding Air Travel Information System (ATIS) dialogues, and we have done so with simpler natural language techniques.

Currently, SpeechActs supports a handful of speech recognizers: BBN's Hark, Texas Instruments' Dagger, and Nuance Communications' recognizers. These recognizers are all continuous (they accept normally spoken speech with no artificial pauses between words) and speaker-independent (they require no training by individual users). For output, the framework provides text-to-speech support for Centigram's TruVoice and AT&T's TrueTalk.

As an existing set of applications, SpeechActs is both a proof of concept and an effective system that about a dozen people now depend upon when they travel. It is powerful enough to be useful, yet easy to use with little training.

As a framework for building speech applications, SpeechActs' contributions include a Unified Grammar for creating synchronized grammars for speech recognition and semantic parsing, reusable plug-in speech components, and the Swiftus natural language processor. SpeechActs also includes important discourse management techniques: a discourse stack and a simple context queue model the current state of the discourse so that SpeechActs can respond naturally. These simple, straightforward components combine to make SpeechActs a powerful framework for designing speech applications.
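
The Unified Grammar's key idea is that both the recognizer's grammar and the parser's semantic patterns are derived from a single rule source, so the two can never drift out of sync. The sketch below illustrates only that single-source idea; the rule format, the UnifiedRule class, and its compile methods are hypothetical stand-ins, not the paper's actual notation.

```python
# A minimal sketch of the "single source, two targets" idea behind a
# Unified Grammar: one rule definition is compiled both into a
# recognizer grammar fragment and into a semantic pattern for parsing.
# All names here (UnifiedRule, to_recognizer_bnf, to_semantic_pattern)
# are hypothetical, not SpeechActs' actual API.

from dataclasses import dataclass

@dataclass
class UnifiedRule:
    name: str      # nonterminal name, e.g. "read_message"
    phrase: list   # terminals (str) and slot references (1-tuples)
    semantics: dict  # semantic frame template with slot references

    def to_recognizer_bnf(self) -> str:
        """Emit a BNF-style production for the speech recognizer."""
        rhs = " ".join(t if isinstance(t, str) else f"<{t[0]}>"
                       for t in self.phrase)
        return f"<{self.name}> ::= {rhs}"

    def to_semantic_pattern(self):
        """Emit a matcher that fills the semantic frame from slot values."""
        def match(slot_values: dict) -> dict:
            frame = dict(self.semantics)
            for key, slot in frame.items():
                if isinstance(slot, tuple):  # slot reference
                    frame[key] = slot_values.get(slot[0])
            return frame
        return match

# One rule, two synchronized artifacts:
rule = UnifiedRule(
    name="read_message",
    phrase=["read", "message", ("msg_number",)],
    semantics={"action": "read", "target": ("msg_number",)},
)
print(rule.to_recognizer_bnf())
# <read_message> ::= read message <msg_number>
matcher = rule.to_semantic_pattern()
print(matcher({"msg_number": 3}))
# {'action': 'read', 'target': 3}
```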
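Reusable plug-in speech components imply that applications code against abstract recognizer and synthesizer interfaces, so one engine (Hark, Dagger, a Nuance recognizer, TruVoice, TrueTalk) can be swapped for another without touching application logic. The abstract does not spell out these interfaces, so the following is a guess at the general pattern, with hypothetical names and stubbed engine classes.

```python
# A minimal sketch of the plug-in pattern: application code depends only
# on abstract Recognizer/Synthesizer interfaces, so concrete engines can
# be substituted freely. The interfaces and the stub classes below are
# hypothetical, not SpeechActs' actual API.

from abc import ABC, abstractmethod

class Recognizer(ABC):
    @abstractmethod
    def load_grammar(self, bnf: str) -> None: ...
    @abstractmethod
    def listen(self) -> str:
        """Return the recognized word string for one utterance."""

class Synthesizer(ABC):
    @abstractmethod
    def speak(self, text: str) -> None: ...

class HarkRecognizer(Recognizer):        # hypothetical stub
    def load_grammar(self, bnf: str) -> None:
        self.grammar = bnf
    def listen(self) -> str:
        return "read message three"      # stubbed recognition result

class TruVoiceSynthesizer(Synthesizer):  # hypothetical stub
    def speak(self, text: str) -> None:
        print(f"[TTS] {text}")

def run_turn(rec: Recognizer, tts: Synthesizer) -> None:
    """Application logic sees only the abstract interfaces."""
    words = rec.listen()
    tts.speak(f"You said: {words}")

run_turn(HarkRecognizer(), TruVoiceSynthesizer())
```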
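Of the two discourse structures, the stack naturally handles nested sub-dialogues (for example, a clarification question interrupting a mail-reading task), while the context queue holds recently mentioned entities so references like "it" can be resolved to the most recent matching referent. A minimal sketch of the two structures follows, assuming hypothetical class and method names rather than the paper's implementation.

```python
# A minimal sketch of the two discourse-state structures described in
# the abstract: a discourse stack for nested sub-dialogues and a context
# queue of recent referents for resolving pronouns. All names here are
# hypothetical, not SpeechActs' actual API.

from collections import deque

class DiscourseManager:
    def __init__(self, context_size: int = 5):
        self.stack = []                            # active (sub)dialogues
        self.context = deque(maxlen=context_size)  # recent referents

    def push_dialogue(self, topic: str) -> None:
        """Enter a nested sub-dialogue (e.g. a clarification)."""
        self.stack.append(topic)

    def pop_dialogue(self) -> str:
        """Finish the current sub-dialogue, resuming the one below it."""
        return self.stack.pop()

    def mention(self, entity) -> None:
        """Record an entity the system or user just referred to."""
        self.context.appendleft(entity)

    def resolve(self, predicate):
        """Resolve a pronoun to the most recent matching referent."""
        for entity in self.context:
            if predicate(entity):
                return entity
        return None

# Example: "Read message three." ... "Forward it to Stuart."
dm = DiscourseManager()
dm.push_dialogue("mail")
dm.mention({"type": "message", "id": 3})
referent = dm.resolve(lambda e: e["type"] == "message")
print(referent)  # {'type': 'message', 'id': 3} -> "it" = message 3
```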