The key problem in building an HMM-based continuous speech recogniser is balancing model complexity against the available training data. The problem is particularly acute for large-vocabulary systems that require cross-word context-dependent modelling, since many such contexts never occur in the training data. This paper describes a method of creating a tied-state continuous speech recognition system using a phonetic decision tree. This tree-based clustering is shown to give recognition performance similar to that of an earlier data-driven approach, with the additional advantage of providing a mapping for unseen triphones. State tying is also compared with traditional model-based tying and shown to be clearly superior. Experimental results are presented for both the Resource Management and Wall Street Journal tasks.
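To make the idea of phonetic decision-tree state tying concrete, the following is a minimal sketch, not the paper's implementation. The abstract does not state the splitting criterion, so the sketch assumes a common choice: each candidate phonetic question is scored by the gain in log-likelihood when a pool of states, modelled by a single one-dimensional Gaussian, is split in two. The names `StateStats`, `pooled_loglik`, `build_tree`, `find_leaf`, the two questions, the thresholds, and the toy statistics are all hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class StateStats:
    """Sufficient statistics for one triphone state (1-D features, assumed)."""
    left: str        # left-context phone
    centre: str      # centre phone
    right: str       # right-context phone
    occ: float       # state occupancy count
    total: float     # sum of observations
    total_sq: float  # sum of squared observations

    @property
    def name(self):
        return f"{self.left}-{self.centre}+{self.right}"


def pooled_loglik(items, var_floor=1e-3):
    """Log-likelihood of the pooled data under one ML-estimated Gaussian."""
    occ = sum(i.occ for i in items)
    if occ <= 0:
        return 0.0
    mean = sum(i.total for i in items) / occ
    raw_var = max(sum(i.total_sq for i in items) / occ - mean * mean, 0.0)
    var = max(raw_var, var_floor)  # floor guards against log(0)
    return -0.5 * occ * (math.log(2 * math.pi * var) + raw_var / var)


def build_tree(items, questions, min_gain, min_occ):
    """Greedily split on the question with the largest likelihood gain."""
    base = pooled_loglik(items)
    best = None
    for qname, (side, phones) in questions.items():
        yes = [i for i in items if getattr(i, side) in phones]
        no = [i for i in items if getattr(i, side) not in phones]
        # Skip splits that leave either child with too little data.
        if min(sum(i.occ for i in yes), sum(i.occ for i in no)) < min_occ:
            continue
        gain = pooled_loglik(yes) + pooled_loglik(no) - base
        if best is None or gain > best[0]:
            best = (gain, qname, yes, no)
    if best is None or best[0] < min_gain:
        # Leaf: every state listed here shares one tied distribution.
        return {"leaf": sorted(i.name for i in items)}
    _, qname, yes, no = best
    return {
        "question": qname,
        "yes": build_tree(yes, questions, min_gain, min_occ),
        "no": build_tree(no, questions, min_gain, min_occ),
    }


def find_leaf(tree, left, right, questions):
    """Answer the questions for any context, seen or unseen, to reach a leaf."""
    while "leaf" not in tree:
        side, phones = questions[tree["question"]]
        context = left if side == "left" else right
        tree = tree["yes"] if context in phones else tree["no"]
    return tree["leaf"]


# Hypothetical questions: (context side, set of phones answering "yes").
questions = {
    "L_nasal": ("left", {"m", "n", "ng"}),
    "R_nasal": ("right", {"m", "n", "ng"}),
}

# Toy statistics: nasal left contexts have means near 0, stops near 5.
states = [
    StateStats("m", "a", "t", 10, 0.0, 10.0),
    StateStats("n", "a", "m", 10, 2.0, 10.4),
    StateStats("p", "a", "t", 10, 50.0, 260.0),
    StateStats("t", "a", "m", 10, 52.0, 280.4),
]

tree = build_tree(states, questions, min_gain=5.0, min_occ=15)
print(tree)
print(find_leaf(tree, "ng", "k", questions))  # ng-a+k never occurs in `states`
```

In this toy run the unseen triphone ng-a+k is routed to the leaf shared by m-a+t and n-a+m, because only the question answers are needed to descend the tree. That is the property the abstract highlights: a data-driven clustering can only group contexts it has observed, whereas a tree of phonetic questions assigns a tied state to any context.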