SpeechActs is a prototype testbed for developing spoken natural-language applications. It lets software developers without special expertise in speech or natural language build applications with which users can speak naturally, as if they were conversing with a personal assistant. We believe we have achieved a degree of conversational naturalness comparable to that of the outstanding Air Travel Information System (ATIS) dialogues, and we have done so with simpler natural-language techniques.

Currently, SpeechActs supports a handful of speech recognizers: BBN's Hark, Texas Instruments' Dagger, and Nuance Communications' recognizers. These recognizers are all continuous (they accept normally spoken speech with no artificial pauses between words) and speaker-independent (they require no training by individual users). For output, the framework provides text-to-speech support for Centigram's TruVoice and AT&T's TrueTalk.

As an existing set of applications, SpeechActs is both a proof of concept and an effective system that about a dozen people now depend on when they travel; it is powerful enough to be useful yet easy to use with little training. As a framework for building speech applications, SpeechActs' contributions include a Unified Grammar for creating synchronized grammars for speech recognition and semantic parsing, reusable plug-in speech components, and the Swiftus natural language processor. SpeechActs also includes important discourse-management techniques: a discourse stack and a simple context queue together model the current state of the discourse so that SpeechActs can respond naturally. These simple, straightforward components combine to make SpeechActs a powerful framework in which to design speech applications.
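To make the discourse-management idea concrete, here is a minimal sketch of how a discourse stack (for nested sub-dialogues) and a bounded context queue (for resolving references like "read it") might be combined. This is an illustrative assumption, not the actual SpeechActs API; all class and method names here are hypothetical.

```python
from collections import deque

class DiscourseContext:
    """Hypothetical sketch of SpeechActs-style discourse state:
    a stack of open sub-dialogues plus a short queue of recently
    mentioned entities used to resolve anaphoric references."""

    def __init__(self, queue_size=5):
        self.stack = []                          # open sub-dialogues, innermost last
        self.recent = deque(maxlen=queue_size)   # most recent mention first

    def push_subdialogue(self, topic):
        # Entering a nested exchange, e.g. a clarification question.
        self.stack.append(topic)

    def pop_subdialogue(self):
        # Closing the innermost exchange and returning to the outer one.
        return self.stack.pop() if self.stack else None

    def mention(self, entity):
        # Record an entity the user or system just referred to.
        self.recent.appendleft(entity)

    def resolve(self, predicate):
        # Return the most recently mentioned entity matching the
        # predicate, e.g. to interpret "it" or "that one".
        for entity in self.recent:
            if predicate(entity):
                return entity
        return None

# Usage: after mail and calendar items have been discussed,
# "read it" in a mail context resolves to the last message mentioned.
ctx = DiscourseContext()
ctx.mention({"type": "message", "sender": "Paul"})
ctx.mention({"type": "appointment", "time": "3pm"})
referent = ctx.resolve(lambda e: e["type"] == "message")
```

The bounded queue keeps reference resolution cheap and recency-biased, which matches the paper's emphasis on simple, straightforward discourse components over heavyweight discourse theory.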