Progress in the CU-HTK broadcast news transcription system

  • Authors:
  • M. J. F. Gales;Do Yeong Kim;P. C. Woodland;Ho Yin Chan;D. Mrva;R. Sinha;S. E. Tranter

  • Affiliations:
  • Eng. Dept., Cambridge Univ.;-;-;-;-;-;-

  • Venue:
  • IEEE Transactions on Audio, Speech, and Language Processing
  • Year:
  • 2006

Abstract

Broadcast news (BN) transcription has been a challenging research area for many years. In the last couple of years, the availability of large amounts of roughly transcribed acoustic training data and advanced model training techniques has offered the opportunity to greatly reduce the error rate on this task. This paper describes the design and performance of BN transcription systems which make use of these developments. First, the effects of using lightly supervised training data and advanced acoustic modeling techniques are discussed. The design of a real-time broadcast news recognition system is then detailed using these new models. As system combination has been found to yield large gains in performance, a range of frameworks that allow multiple recognition outputs to be combined are next described. These include the use of multiple types of acoustic models and multiple segmentations. As a contrast, the "SuperEARS" system, which was developed by multiple sites and allows cross-site combination, is also described. The various models and recognition configurations are evaluated using several recent BN development and evaluation test sets. These new BN transcription systems can give gains of over 25% relative to the CU-HTK 2003 BN system.