Semisupervised Learning of Hidden Markov Models via a Homotopy Method

Authors:
Shihao Ji;Layne T. Watson;Lawrence Carin
Affiliations:
Duke University, Durham;Virginia Polytechnic Institute and State University, Blacksburg;Duke University, Durham
Venue:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Year:
2009

Abstract

Hidden Markov model (HMM) classifier design is considered for the analysis of sequential data, incorporating both labeled and unlabeled data for training; the balance between the use of labeled and unlabeled data is controlled by an allocation parameter \lambda \in [0, 1), where \lambda = 0 corresponds to purely supervised HMM learning (based only on the labeled data) and \lambda = 1 corresponds to unsupervised HMM-based clustering (based only on the unlabeled data). The associated estimation problem can typically be reduced to solving a set of fixed-point equations in the form of a "natural-parameter homotopy." This paper applies a homotopy method to track a continuous path of solutions, starting from a local supervised solution (\lambda = 0) to a local unsupervised solution (\lambda = 1). The homotopy method is guaranteed to track with probability one from \lambda = 0 to \lambda = 1 if the \lambda = 0 solution is unique; this condition is not satisfied for the HMM since the maximum likelihood supervised solution (\lambda = 0) is characterized by many local optima. A modified form of the homotopy map for HMMs assures a track from \lambda = 0 to \lambda = 1. Following this track leads to a formulation for selecting \lambda \in [0, 1) for a semisupervised solution and it also provides a tool for selection from among multiple local-optimal supervised solutions. The results of applying the proposed method to measured and synthetic sequential data verify its robustness and feasibility compared to the conventional EM approach for semisupervised HMM training.
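
As a rough illustration of the path-tracking idea described in the abstract, the sketch below follows solutions of a \lambda-weighted fixed-point equation from \lambda = 0 to \lambda = 1. It is not the paper's method, and several things are assumptions made only for this example: a two-component Gaussian mixture with unit variance and equal weights stands in for the HMM, the path is followed by plain warm-started fixed-point (EM-style) iteration on a grid of \lambda values rather than by a probability-one homotopy tracker, and the function names and synthetic data are hypothetical.

```python
import numpy as np


def responsibilities(x, mu):
    """Posterior component probabilities for unit-variance, equal-weight Gaussians."""
    logp = -0.5 * (x[:, None] - mu[None, :]) ** 2
    logp -= logp.max(axis=1, keepdims=True)  # stabilize the exponentials
    p = np.exp(logp)
    return p / p.sum(axis=1, keepdims=True)


def fixed_point_step(mu, lam, x_lab, y_lab, x_unl):
    """One update of the means for the objective
    (1 - lam) * labeled log-likelihood + lam * unlabeled log-likelihood."""
    r = responsibilities(x_unl, mu)
    new_mu = np.empty_like(mu)
    for k in range(mu.size):
        mask = y_lab == k
        num = (1 - lam) * x_lab[mask].sum() + lam * (r[:, k] * x_unl).sum()
        den = (1 - lam) * mask.sum() + lam * r[:, k].sum()
        new_mu[k] = num / den
    return new_mu


def track_path(x_lab, y_lab, x_unl, lambdas, tol=1e-10, max_iter=1000):
    """Follow the solution across the lambda grid, warm-starting each solve
    from the solution found at the previous lambda."""
    # Supervised starting point: per-class means of the labeled data.
    mu = np.array([x_lab[y_lab == k].mean() for k in range(2)])
    path = []
    for lam in lambdas:
        for _ in range(max_iter):
            new_mu = fixed_point_step(mu, lam, x_lab, y_lab, x_unl)
            converged = np.max(np.abs(new_mu - mu)) < tol
            mu = new_mu
            if converged:
                break
        path.append(mu.copy())
    return np.array(path)


# Hypothetical synthetic data: a few labeled points, many unlabeled ones.
rng = np.random.default_rng(0)
x_lab = np.array([-2.3, -1.6, 1.8, 2.5])
y_lab = np.array([0, 0, 1, 1])
x_unl = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(2.0, 1.0, 200)])

lambdas = np.linspace(0.0, 1.0, 11)
path = track_path(x_lab, y_lab, x_unl, lambdas)
for lam, mu in zip(lambdas, path):
    print(f"lambda={lam:.1f}  means={mu}")
```

Each row of `path` is the pair of estimated means at one \lambda, so the first row is the purely supervised estimate and the last row is a fixed point of the purely unsupervised update. Warm-starting on a grid can jump between solution branches when the path turns or bifurcates, which is the kind of failure a dedicated homotopy tracker is meant to avoid.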