SpeechActs is a prototype testbed for developing spoken natural-language applications. It lets software developers without special expertise in speech or natural language build applications with which users can speak naturally, as if they were conversing with a personal assistant. We believe we have achieved a degree of conversational naturalness comparable to that of the outstanding Air Travel Information System (ATIS) dialogues, and we have done so with simpler natural-language techniques.

Currently, SpeechActs supports a handful of speech recognizers: BBN's Hark, Texas Instruments' Dagger, and Nuance Communications' recognizers. These recognizers are all continuous (they accept normally spoken speech with no artificial pauses between words) and speaker-independent (they require no training by individual users). For output, the framework provides text-to-speech support for Centigram's TruVoice and AT&T's TrueTalk.

As an existing set of applications, SpeechActs is both a proof of concept and an effective system that about a dozen people now depend on when they travel; it is powerful enough to be useful yet easy to use with little training. As a framework for building speech applications, SpeechActs' contributions include a Unified Grammar for creating synchronized grammars for speech recognition and semantic parsing, reusable plug-in speech components, and the Swiftus natural language processor. SpeechActs also includes important discourse-management techniques: a discourse stack and a simple context queue together model the current state of the discourse so that SpeechActs can respond naturally. These simple, straightforward components combine to make SpeechActs a powerful framework in which to design speech applications.
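To make the discourse-management idea concrete, here is a minimal sketch of how a discourse stack (for nested sub-dialogues) and a bounded context queue (for resolving references like "read it") might be combined. This is an illustrative assumption, not the actual SpeechActs API; all class and method names here are hypothetical.

```python
from collections import deque

class DiscourseContext:
    """Hypothetical sketch of SpeechActs-style discourse state:
    a stack of open sub-dialogues plus a short queue of recently
    mentioned entities used to resolve anaphoric references."""

    def __init__(self, queue_size=5):
        self.stack = []                          # open sub-dialogues, innermost last
        self.recent = deque(maxlen=queue_size)   # most recent mention first

    def push_subdialogue(self, topic):
        # Entering a nested exchange, e.g. a clarification question.
        self.stack.append(topic)

    def pop_subdialogue(self):
        # Closing the innermost exchange and returning to the outer one.
        return self.stack.pop() if self.stack else None

    def mention(self, entity):
        # Record an entity the user or system just referred to.
        self.recent.appendleft(entity)

    def resolve(self, predicate):
        # Return the most recently mentioned entity matching the
        # predicate, e.g. to interpret "it" or "that one".
        for entity in self.recent:
            if predicate(entity):
                return entity
        return None

# Usage: after mail and calendar items have been discussed,
# "read it" in a mail context resolves to the last message mentioned.
ctx = DiscourseContext()
ctx.mention({"type": "message", "sender": "Paul"})
ctx.mention({"type": "appointment", "time": "3pm"})
referent = ctx.resolve(lambda e: e["type"] == "message")
```

The bounded queue keeps reference resolution cheap and recency-biased, which matches the paper's emphasis on simple, straightforward discourse components over heavyweight discourse theory.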