Word fragments pose serious problems for speech recognizers. Accurately identifying word fragments would not only improve recognition accuracy but also help disfluency detection algorithms, because the occurrence of a word fragment is a good indicator of a speech disfluency. Unlike previous efforts that included word fragments in the acoustic model, this paper approaches word fragment identification differently: by building classifiers that use acoustic-prosodic features. Our experiments show that, by combining a few voice quality measures with prosodic features extracted from forced alignments with the human transcriptions, we obtain a precision of 74.3% and a recall of 70.1% on downsampled spontaneous speech data. The overall accuracy is 72.9%, significantly better than the chance performance of 50%.
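
The three reported figures can be related with a short calculation. The sketch below assumes that "downsampled" means the evaluation set holds equal numbers of fragment and non-fragment examples, which is consistent with the stated 50% chance level but is not spelled out in the abstract. The function name `balanced_accuracy_from_pr` is hypothetical and used only for illustration.

```python
def balanced_accuracy_from_pr(precision: float, recall: float) -> float:
    """Accuracy implied by precision and recall on a class-balanced set.

    Assumption (not stated in the abstract): positives and negatives are
    equally frequent, so counts are normalized to one positive and one
    negative example.
    """
    tp = recall                               # true positives per positive example
    fp = tp * (1.0 - precision) / precision   # from precision = tp / (tp + fp)
    tn = 1.0 - fp                             # true negatives per negative example
    return (tp + tn) / 2.0


# Precision and recall as reported in the abstract.
acc = balanced_accuracy_from_pr(0.743, 0.701)
print(f"{acc:.3f}")  # prints 0.729
```

Under the balanced-set assumption, the reported precision and recall imply an accuracy of about 72.9%, which matches the overall accuracy quoted in the abstract.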