Recent years have seen steady improvements in the quality and performance of speech-based human-machine interaction, driven by a significant convergence in the methods and techniques employed. However, the quantity of training data required to improve state-of-the-art systems seems to be growing exponentially, and yet performance appears to be reaching an asymptote that is not only well short of human performance but may also be inadequate for many real-world applications. This situation suggests that there may be a fundamental flaw in the underlying architecture of contemporary speech-based systems, and the future direction for research into spoken language processing is currently uncertain. This chapter addresses these issues by stepping outside the familiar domains of speech science and technology and instead drawing inspiration from recent findings in fields of research concerned with the neurobiology of living systems in general. In particular, four areas are highlighted: the growing evidence for an intimate relationship between sensor and motor behaviour in living organisms, the power of negative feedback control to accommodate unpredictable disturbances in real-world environments, mechanisms for imitation and mental imagery for learning and modelling, and hierarchical models of temporal memory for predicting future behaviour and anticipating the outcome of events. The chapter shows how these results point towards a novel architecture for speech-based human-machine interaction that blurs the distinction between the core components of a traditional spoken language dialogue system: an architecture in which cooperative and communicative behaviour emerges as a by-product of a model of interaction where the system has in mind the needs and intentions of a user, and a user has in mind the needs and intentions of the system.
It concludes with a roadmap of the technical prerequisites and desiderata that appear necessary if voice-based interaction with an autonomous agent, such as a virtual butler, is to become a practical reality.