Metadata for integrating speech documents in a text retrieval system

  • Authors:
  • Ulrike Glavitsch;Peter Schäuble;Martin Wechsler

  • Affiliations:
  • Institut für Informationssysteme, Swiss Federal Institute of Technology (ETH), CH-8092 Zürich (Switzerland);Institut für Informationssysteme, Swiss Federal Institute of Technology (ETH), CH-8092 Zürich (Switzerland);Institut für Informationssysteme, Swiss Federal Institute of Technology (ETH), CH-8092 Zürich (Switzerland)

  • Venue:
  • ACM SIGMOD Record
  • Year:
  • 1994

Abstract

We present an information retrieval system that allows text and speech documents to be searched simultaneously. The retrieval system accepts vague queries and performs a best-match search to find the documents that are relevant to the query. Its output is a ranked list of documents in which the documents at the top of the list best satisfy the user's information need. The relevance of the documents is estimated by means of metadata (document description vectors). The metadata is generated automatically and is organized so that queries can be processed efficiently. We introduce a controlled indexing vocabulary for both speech and text documents. The new indexing vocabulary is small (1000 features) compared with the indexing vocabularies of conventional text retrieval (10000 to 100000 features). We show that the retrieval effectiveness based on such a small indexing vocabulary is similar to the retrieval effectiveness of a Boolean retrieval system.
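The best-match search over document description vectors can be illustrated with a minimal sketch. The abstract does not state how features are weighted or how query and document vectors are compared, so the raw feature counts and the cosine similarity used here are assumptions, as are the tiny vocabulary, the document identifiers, and the function names `describe`, `cosine`, and `rank`. The sketch also assumes that text and speech documents have already been reduced to tokens of the shared indexing vocabulary.

```python
import math
from collections import Counter

# Hypothetical controlled indexing vocabulary. The abstract mentions about
# 1000 features; these five entries are invented for illustration.
VOCABULARY = ["speech", "retrieval", "query", "index", "vector"]


def describe(tokens):
    """Build a description vector of raw counts per vocabulary feature.

    Raw counts are an assumed weighting, not the scheme used by the authors.
    """
    counts = Counter(t for t in tokens if t in VOCABULARY)
    return [counts[feature] for feature in VOCABULARY]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank(query_tokens, documents):
    """Return document ids ordered by decreasing similarity to the query."""
    q = describe(query_tokens)
    scored = [(cosine(q, describe(tokens)), doc_id)
              for doc_id, tokens in documents.items()]
    # Sort by score (descending), then by id so that ties are deterministic.
    return [doc_id for _, doc_id in sorted(scored, key=lambda s: (-s[0], s[1]))]


# Invented example collection mixing text and speech documents.
DOCUMENTS = {
    "text-1": "query index index vector".split(),
    "speech-1": "speech retrieval speech query".split(),
    "text-2": "weather report".split(),
}

if __name__ == "__main__":
    print(rank(["speech", "retrieval"], DOCUMENTS))
```

Because every document is scored rather than filtered, a vague query still yields a complete ranked list, in contrast to the exact-match behaviour of a Boolean system.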