Speech and text-image processing in documents

Authors:
Marcia A. Bush
Affiliations:
Xerox Palo Alto Research Center, Palo Alto, CA
Venue:
HLT '93 Proceedings of the workshop on Human Language Technology
Year:
1993

References:
Evaluation of spoken language systems: the ATIS domain
HLT '90 Proceedings of the workshop on Speech and Natural Language

Wordspotting for voice editing and audio indexing
CHI '92 Proceedings of the SIGCHI Conference on Human Factors in Computing Systems

GNU EMACS Manual

Robot Vision

Cited by:
Fax: an alternative to SGML
COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 1

Abstract

Two themes have evolved in speech and text-image processing work at Xerox PARC that expand and redefine the role of recognition technology in document-oriented applications. One is the development of systems that provide functionality similar to that of text processors but operate directly on audio and scanned image data. A second, related theme is the use of speech and text-image recognition to retrieve arbitrary, user-specified information from documents with signal content. This paper discusses three research initiatives at PARC that exemplify these themes: a text-image editor [1], a wordspotter for voice editing and indexing [12], and a decoding framework for scanned-document content retrieval [4]. The discussion focuses on key concepts embodied in the research that enable novel signal-based document processing functionality.