In this paper we present an incremental word learning system that can cope with only a few training samples, enabling speech acquisition in on-line human-robot interaction. As with most automatic speech recognition (ASR) systems, our architecture relies on a Hidden Markov Model (HMM) framework in which the word models are trained sequentially and the system starts with little prior knowledge. The performance of HMMs depends on the amount of training data, the initialization procedure, and the efficiency of the discriminative training algorithms, so we propose several approaches to improve the system. One major problem when training on a small amount of data is over-fitting; hence we present a novel estimate of the variance floor that depends on the number of available training samples. Next, we propose a bootstrapping approach to obtain a good initialization of the HMM parameters. This method is based on unsupervised training of the parameters followed by the construction of a new HMM through aligning and merging Viterbi-decoded state sequences. Finally, we investigate large-margin discriminative training techniques to improve the generalization of the models, using several strategies suited to limited training data. In the evaluation, we examine the contribution of each proposed stage to the overall system performance. This includes comparing our techniques with state-of-the-art methods and investigating how far the number of training samples can be reduced. We evaluate our algorithms on isolated and continuous digit recognition tasks. In summary, the proposed algorithms yield significant improvements and are a step towards efficient learning from few examples.
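The idea of a variance floor that depends on the number of training samples can be sketched as follows. The abstract does not give the paper's actual formula, so the schedule `base_floor + alpha / n` below is purely an illustrative assumption: the floor is a fraction of the pooled global variance that starts large when few samples are available (guarding against over-fitting of the Gaussian variances) and decays toward a fixed fraction as more data arrive.

```python
import numpy as np

def floored_variances(samples, base_floor=0.01, alpha=1.0):
    """Per-dimension variance estimates with a sample-count-dependent floor.

    Hypothetical sketch: the floor is (base_floor + alpha / n) times the
    pooled global variance, so sparse data yields a more conservative
    (larger) floor. This is not the formula from the paper.
    """
    samples = np.asarray(samples, dtype=float)  # shape: (n_samples, n_dims)
    n = len(samples)
    raw_var = samples.var(axis=0)               # maximum-likelihood variances
    global_var = samples.var()                  # pooled variance over all dims
    floor = (base_floor + alpha / n) * global_var
    # Clip each dimension's variance from below to avoid degenerate Gaussians.
    return np.maximum(raw_var, floor)
```

With three identical feature vectors the raw variances collapse to zero, but the floored estimates stay strictly positive, which keeps the corresponding Gaussian emission densities well defined.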