Pervasive Speech and Language Technology

Authors: Wolfgang Wahlster
Venue: Informatics - 10 Years Back. 10 Years Ahead.
Year: 2001

Abstract

Advances in human language technology offer the promise of pervasive access to on-line information and electronic services. Since almost everyone speaks and understands a language, the development of natural language systems will allow the average person to interact with computers anytime and anywhere without special skills or training, using common devices such as a mobile telephone. The latest results and component technologies for multilingual and robust speech processing, prosodic analysis, parsing, semantic analysis, discourse understanding, translation, and speech synthesis are reviewed using the Verbmobil system as an example. Verbmobil is a speaker-independent and bidirectional speech-to-speech translation system for spontaneous dialogs in mobile situations. It recognizes spoken input, analyses and translates it, and finally utters the translation. The multilingual system handles dialogs in three business-oriented domains, with context-sensitive translation between three languages (German, English, and Japanese). We will show that the most successful current systems are based on hybrid architectures incorporating both deep and shallow processing schemes. They integrate a broad spectrum of statistical and rule-based methods and combine the results of machine learning from large corpora with linguists' hand-crafted knowledge sources to achieve an adequate level of robustness and accuracy. We argue that packed representations together with formalisms for underspecification capture the uncertainties in each processing phase, so that these uncertainties can be reduced by linguistic, discourse, and domain constraints as soon as they become applicable. We show that the current core technologies for natural language and speech processing enable us to create the next generation of information extraction and summarization systems for the Web, speech-based Internet access, and multimodal communication assistants combining speech and gesture.
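
The idea of a packed, underspecified representation that is narrowed down as constraints become applicable can be illustrated with a minimal Python sketch. This is not Verbmobil's actual formalism: the class `PackedAnalysis`, its methods, the slot names, and the example values are all hypothetical and chosen only for illustration. Each slot keeps every value that is still possible, and a constraint removes values from a single slot without enumerating all full readings.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


@dataclass
class PackedAnalysis:
    """Hypothetical packed analysis: each slot holds all values still possible."""
    slots: Dict[str, FrozenSet[str]]

    def count_readings(self) -> int:
        """Number of fully specified readings the packed form stands for."""
        total = 1
        for values in self.slots.values():
            total *= len(values)
        return total

    def constrain(self, slot: str, keep: Callable[[str], bool]) -> "PackedAnalysis":
        """Return a new analysis in which `slot` keeps only values passing `keep`.

        If the constraint would rule out every value, the analysis is returned
        unchanged (an assumed robustness policy, not taken from the source).
        """
        remaining = frozenset(v for v in self.slots[slot] if keep(v))
        if not remaining:
            return self
        return PackedAnalysis({**self.slots, slot: remaining})


# Invented example: an ambiguous word sense and an ambiguous attachment.
analysis = PackedAnalysis({
    "sense": frozenset({"seat", "square", "room"}),
    "attachment": frozenset({"verb", "noun"}),
})

# A domain constraint becomes applicable first and fixes the word sense.
after_domain = analysis.constrain("sense", lambda s: s == "seat")

# A discourse constraint becomes applicable later and fixes the attachment.
after_discourse = after_domain.constrain("attachment", lambda a: a == "verb")

print(analysis.count_readings(),
      after_domain.count_readings(),
      after_discourse.count_readings())  # prints: 6 2 1
```

The packed form stores five values in total rather than six separate readings, and each constraint is applied as soon as it is available instead of waiting for a complete disambiguation step.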