Effects of speaking rate and word frequency on pronunciations in conversational speech
Speech Communication - Special issue on modeling pronunciation variation for automatic speech recognition
Shallow parsing with pos taggers and linguistic features
The Journal of Machine Learning Research
TnT: a statistical part-of-speech tagger
ANLC '00 Proceedings of the sixth conference on Applied natural language processing
The effect of language model probability on pronunciation reduction
ICASSP '01 Proceedings of the Acoustics, Speech, and Signal Processing, 200. on IEEE International Conference - Volume 02
Hybrid statistical pronunciation models designed to be trained by a medium-size corpus
Computer Speech and Language
Hi-index | 0.00 |
A detailed description of the discourse context of a word can be used for predicting word pronunciation in discourse context and also enables studies of the interplay between various types of information on e.g. phone-level pronunciation. The work presented in this paper is aimed at modelling systematic variation in the phone-level realisation of words inherent to a language variety. A data-driven approach based on access to detailed discourse context descriptions is used. The discourse context descriptions are constructed through annotation of spoken language with a large variety of linguistic and related variables in multiple layers. Decision tree pronunciation models are induced from the annotation. The effects of using different types and different amounts of information for model induction are explored. Models generated in a tenfold cross-validation experiment produce on average 8.2% errors on the phone level when they are trained on all available information. Models trained on phoneme level information only have an average phone error rate of 14.2%. This means that including information above the phoneme level in the context description can improve model performance by 42.2%.