Broadcast news (BN) transcription has been a challenging research area for many years. In the last couple of years, the availability of large amounts of roughly transcribed acoustic training data, together with advanced model training techniques, has offered the opportunity to greatly reduce the error rate on this task. This paper describes the design and performance of BN transcription systems which make use of these developments. First, the effects of using lightly supervised training data and advanced acoustic modeling techniques are discussed. The design of a real-time broadcast news recognition system using these new models is then detailed. As system combination has been found to yield large gains in performance, a range of frameworks that allow multiple recognition outputs to be combined is described next. These include the use of multiple types of acoustic models and multiple segmentations. As a contrast, the "SuperEARS" system, which was developed by multiple sites and allows cross-site combination, is also described. The various models and recognition configurations are evaluated using several recent BN development and evaluation test sets. These new BN transcription systems can give relative gains of over 25% compared with the CU-HTK 2003 BN system.
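To make the notion of a relative gain concrete, the following is a minimal sketch, assuming the gain is measured as a relative reduction in word error rate (WER). The function name and the WER values are hypothetical illustrations and are not results from the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative reduction in WER as a fraction of the baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical WER values in percent, chosen only for illustration.
baseline, improved = 12.0, 9.0
gain = relative_wer_reduction(baseline, improved)
print(f"{gain:.0%} relative, {baseline - improved:.1f} points absolute")
# prints "25% relative, 3.0 points absolute"
```

Under this assumption, a 25% relative gain on a 12.0% baseline corresponds to only 3.0 points of absolute WER, which is why it matters whether a reported gain is relative or absolute.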