Content-based language models for spoken document retrieval

  • Authors:
  • Hsin-min Wang; Berlin Chen

  • Affiliations:
  • Institute of Information Science, Academia Sinica, Taipei, Taiwan, R.O.C.; Institute of Information Science, Academia Sinica, Taipei, Taiwan, R.O.C.

  • Venue:
  • IRAL '00 Proceedings of the fifth international workshop on Information retrieval with Asian languages
  • Year:
  • 2000

Abstract

Spoken document retrieval (SDR) has been extensively studied in recent years because of its potential use in navigating large multimedia collections in the near future. This paper presents a novel approach that applies content-based language models to spoken document retrieval. In an example task of retrieving Mandarin broadcast news, the content-based language models were either trained on the automatic transcriptions of the spoken documents or adapted from baseline language models using those transcriptions. They were then used to produce more accurate recognition results and indexing terms for both the spoken documents and the speech queries. We report on some interesting findings obtained in this research.