Pervasive Speech and Language Technology

  • Authors:
  • Wolfgang Wahlster

  • Venue:
  • Informatics - 10 Years Back. 10 Years Ahead.
  • Year:
  • 2001

Abstract

Advances in human language technology offer the promise of pervasive access to on-line information and electronic services. Since almost everyone speaks and understands a language, the development of natural language systems will allow the average person to interact with computers anytime and anywhere without special skills or training, using common devices such as a mobile telephone. The latest results and component technologies for multilingual and robust speech processing, prosodic analysis, parsing, semantic analysis, discourse understanding, translation, and speech synthesis are reviewed using the Verbmobil system as an example. Verbmobil is a speaker-independent and bidirectional speech-to-speech translation system for spontaneous dialogs in mobile situations. It recognizes spoken input, analyses and translates it, and finally utters the translation. The multilingual system handles dialogs in three business-oriented domains, with context-sensitive translation between three languages (German, English, and Japanese). We show that the most successful current systems are based on hybrid architectures incorporating both deep and shallow processing schemes. They integrate a broad spectrum of statistical and rule-based methods and combine the results of machine learning from large corpora with linguists' hand-crafted knowledge sources to achieve an adequate level of robustness and accuracy. We argue that packed representations together with formalisms for underspecification capture the uncertainties in each processing phase, so that these uncertainties can be reduced by linguistic, discourse and domain constraints as soon as they become applicable. We show that the current core technologies for natural language and speech processing enable us to create the next generation of information extraction and summarization systems for the Web, speech-based Internet access and multimodal communication assistants combining speech and gesture.
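
The idea of a packed, underspecified representation that is narrowed as constraints become applicable can be illustrated with a small sketch. This is not Verbmobil's actual data structure or API: the class `PackedReading`, the slot names, and the example word senses are all hypothetical, chosen only to show how alternatives can be kept per slot rather than enumerated as full readings, and how a later constraint removes options without recomputing earlier phases.

```python
from itertools import product


class PackedReading:
    """Hypothetical packed representation: alternatives are stored per slot,
    so n slots with k options each cost n*k entries instead of k**n readings."""

    def __init__(self, slots):
        self.slots = {name: set(options) for name, options in slots.items()}

    def apply(self, slot, predicate):
        """Narrow one slot with a constraint as soon as it is applicable.
        If the constraint would remove every option, the slot is left
        unchanged (a simple robustness fallback, assumed for this sketch)."""
        kept = {option for option in self.slots[slot] if predicate(option)}
        if kept:
            self.slots[slot] = kept
        return self

    def count(self):
        """Number of full readings still compatible with the packed form."""
        total = 1
        for options in self.slots.values():
            total *= len(options)
        return total

    def readings(self):
        """Unpack into explicit readings only when a consumer needs them."""
        names = sorted(self.slots)
        for combo in product(*(sorted(self.slots[n]) for n in names)):
            yield dict(zip(names, combo))


# Toy example: two underspecified slots left open by an earlier phase.
packed = PackedReading({
    "sense": {"appointment", "date", "deadline"},
    "tense": {"present", "future"},
})
initial = packed.count()                                 # 3 * 2 = 6 readings

# A (hypothetical) domain constraint rules out one word sense.
packed.apply("sense", lambda s: s != "deadline")
after_domain = packed.count()                            # 2 * 2 = 4 readings

# A (hypothetical) discourse constraint fixes the tense.
packed.apply("tense", lambda t: t == "future")
after_discourse = packed.count()                         # 2 * 1 = 2 readings
```

The design choice shown here is that each constraint touches only the slot it concerns, so uncertainty from one processing phase is carried forward compactly and reduced incrementally rather than resolved prematurely.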