Training conditional random fields with unlabeled data and limited number of labeled examples

  • Authors:
  • Tak-Lam Wong; Wai Lam

  • Affiliations:
  • Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, Hong Kong; Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, Hong Kong

  • Venue:
  • ICMLC'05: Proceedings of the 4th International Conference on Advances in Machine Learning and Cybernetics
  • Year:
  • 2005

Abstract

Conditional random fields (CRFs) are a probabilistic approach that has been applied to sequence labeling tasks with good performance. We extend the model to incorporate unlabeled data, so that the human effort required to prepare labeled training examples can be reduced. Instead of maximizing the conditional likelihood, we aim at maximizing the likelihood of the observation sequences from both the labeled and the unlabeled data. We have conducted extensive experiments on two different data sets to evaluate the performance. The experimental results show that our model, learned from both labeled and unlabeled data, outperforms the model learned from labeled training examples alone.
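The abstract's central idea, combining a conditional likelihood term for labeled sequences with a marginal observation likelihood term for unlabeled sequences, can be illustrated with a small sketch. The following is not the paper's actual model: it is a toy joint scoring model over tiny alphabets, with all feature names, weights, and data invented for illustration, and normalizing sums computed by brute-force enumeration rather than the dynamic programming a real CRF would use.

```python
import math
from itertools import product

# Toy setup (all sizes and names are illustrative assumptions, not from the paper):
# sequences are short and alphabets tiny, so every normalizing sum can be
# enumerated exactly instead of computed with forward-backward recursions.
OBS = ["a", "b"]     # observation alphabet
LABELS = [0, 1]      # label alphabet
LENGTH = 2           # fixed sequence length

def score(x, y, w):
    """Linear score of a joint (observation, label) sequence pair using
    emission features (x_t, y_t) and transition features (y_t, y_{t+1})."""
    s = 0.0
    for t in range(len(x)):
        s += w.get(("emit", x[t], y[t]), 0.0)
        if t + 1 < len(y):
            s += w.get(("trans", y[t], y[t + 1]), 0.0)
    return s

def logsumexp(vals):
    m = max(vals)
    return m + math.log(sum(math.exp(v - m) for v in vals))

def objective(labeled, unlabeled, w):
    """Combined log-likelihood: a conditional term log p(y|x) for each
    labeled pair plus a marginal term log p(x) for each unlabeled sequence."""
    all_y = list(product(LABELS, repeat=LENGTH))
    all_x = list(product(OBS, repeat=LENGTH))
    # Global log-partition over all (x, y) pairs; tractable only in this toy case.
    log_z = logsumexp([score(x, y, w) for x in all_x for y in all_y])
    obj = 0.0
    for x, y in labeled:
        # log p(y|x) = score(x, y) - log sum_{y'} exp score(x, y')
        obj += score(x, y, w) - logsumexp([score(x, yp, w) for yp in all_y])
    for x in unlabeled:
        # log p(x) = log sum_{y'} exp score(x, y') - log Z
        obj += logsumexp([score(x, yp, w) for yp in all_y]) - log_z
    return obj

# Hypothetical weights and data for a single evaluation of the objective.
w = {("emit", "a", 0): 1.0, ("emit", "b", 1): 1.0, ("trans", 0, 1): 0.5}
labeled = [(("a", "b"), (0, 1))]
unlabeled = [("b", "b")]
print(objective(labeled, unlabeled, w))
```

In an actual training run, one would ascend the gradient of this objective with respect to the weights; the unlabeled term is what lets observation sequences without labels still influence the learned parameters.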