Theoretical analysis of cross-validation (CV)-EM algorithm
ICANN'10 Proceedings of the 20th international conference on Artificial neural networks: Part III
A new maximum-likelihood training algorithm is proposed that compensates for a weakness of the EM algorithm: it uses the cross-validation likelihood in the expectation step to avoid overtraining. Because it operates on a set of sufficient statistics associated with a partitioning of the training data, as in parallel EM, the algorithm has the same order of computational cost as the original EM algorithm. A second variant uses an approximation of bagging to reduce variance in the E-step, at somewhat higher cost. Analyses using GMMs with artificial data show that the proposed algorithms are more robust to overtraining than the conventional EM algorithm. Large-vocabulary recognition experiments on Mandarin broadcast news data show that the methods make better use of additional parameters and give lower recognition error rates than standard EM training.
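The core idea described in the abstract can be sketched as follows: the training data are split into folds, per-fold sufficient statistics are maintained, and the E-step for each fold uses parameters built from the statistics of all *other* folds, so no data point is scored by a model that its own statistics helped estimate. The sketch below is a minimal, hypothetical 1-D GMM implementation of this scheme (names such as `cv_em_gmm_1d` are illustrative, not the authors' code), assuming quantile-based initialization and a three-fold partition:

```python
import numpy as np

def cv_em_gmm_1d(x, n_comp=2, n_folds=3, n_iter=30, seed=0):
    """Sketch of a CV-EM loop for a 1-D Gaussian mixture.

    Each fold's E-step uses parameters derived from the sufficient
    statistics of the OTHER folds (the cross-validation likelihood idea),
    while keeping EM's per-iteration cost: one pass over the data.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), n_folds)

    # Shared initial parameters (quantile init keeps the means spread out).
    mu = np.quantile(x, np.linspace(0.1, 0.9, n_comp))
    var = np.full(n_comp, x.var() + 1e-6)
    w = np.full(n_comp, 1.0 / n_comp)

    def e_step(xs, w_, mu_, var_):
        # Responsibilities under a 1-D GMM.
        p = w_ * np.exp(-0.5 * (xs[:, None] - mu_) ** 2 / var_) \
            / np.sqrt(2.0 * np.pi * var_)
        return p / p.sum(axis=1, keepdims=True)

    # Per-fold sufficient statistics: counts, sums, sums of squares.
    S0 = np.zeros((n_folds, n_comp))
    S1 = np.zeros((n_folds, n_comp))
    S2 = np.zeros((n_folds, n_comp))
    for k, f in enumerate(folds):           # fill with the shared init model
        r = e_step(x[f], w, mu, var)
        S0[k] = r.sum(axis=0)
        S1[k] = (r * x[f][:, None]).sum(axis=0)
        S2[k] = (r * x[f][:, None] ** 2).sum(axis=0)

    for _ in range(n_iter):
        for k, f in enumerate(folds):
            # Parameters from all folds EXCEPT k: the cross-validation E-step.
            n0 = S0.sum(axis=0) - S0[k]
            n1 = S1.sum(axis=0) - S1[k]
            n2 = S2.sum(axis=0) - S2[k]
            w_k = n0 / n0.sum()
            mu_k = n1 / n0
            var_k = n2 / n0 - mu_k ** 2 + 1e-6
            r = e_step(x[f], w_k, mu_k, var_k)
            S0[k] = r.sum(axis=0)
            S1[k] = (r * x[f][:, None]).sum(axis=0)
            S2[k] = (r * x[f][:, None] ** 2).sum(axis=0)

    # Final M-step aggregates all folds' statistics into one model.
    n0, n1, n2 = S0.sum(axis=0), S1.sum(axis=0), S2.sum(axis=0)
    w, mu = n0 / n0.sum(), n1 / n0
    var = n2 / n0 - mu ** 2 + 1e-6
    return w, mu, var
```

Because each fold's statistics are simply subtracted from the pooled totals, the held-out parameters cost only an array subtraction per fold, which is what keeps this variant at the same order of computation as standard EM.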