Combining a naive Bayes classifier with the EM algorithm is a promising approach to exploiting unlabeled data for disambiguation tasks that rely on local context features, such as word sense disambiguation and spelling correction. However, using unlabeled data via the basic EM algorithm often causes severe performance degradation instead of improving classification, so average performance remains poor. In this study, we introduce a class distribution constraint into the EM iteration process. This constraint keeps the class distribution of the unlabeled data consistent with the class distribution estimated from the labeled data, preventing the EM algorithm from converging to an undesirable state. Experimental results on 26 confusion sets with a large amount of unlabeled data show that the proposed method considerably improves classification performance when only a small amount of labeled data is available.
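To make the constrained EM loop concrete, the sketch below implements semi-supervised multinomial naive Bayes with EM, where after each E-step the posteriors over the unlabeled data are rescaled so that their average class distribution matches the distribution estimated from the labeled data. This is a minimal sketch under stated assumptions, not the paper's exact method: the single-pass column rescaling followed by row renormalization only approximately enforces the target distribution, and all names (train_nb, posteriors, em_with_cdc) are illustrative.

```python
import numpy as np

def train_nb(X, Y, alpha=1.0):
    """Fit multinomial naive Bayes from (possibly soft) labels Y: (n, n_classes)."""
    prior = Y.sum(axis=0) + alpha
    prior /= prior.sum()
    # Class-conditional word probabilities with Laplace smoothing.
    counts = Y.T @ X + alpha                       # (n_classes, n_features)
    cond = counts / counts.sum(axis=1, keepdims=True)
    return np.log(prior), np.log(cond)

def posteriors(X, log_prior, log_cond):
    """E-step: P(c | x) for each row of X."""
    joint = X @ log_cond.T + log_prior             # unnormalized log posteriors
    joint -= joint.max(axis=1, keepdims=True)      # stabilize before exp
    p = np.exp(joint)
    return p / p.sum(axis=1, keepdims=True)

def em_with_cdc(X_l, y_l, X_u, n_classes, n_iter=20):
    """Naive Bayes + EM with a class distribution constraint (CDC):
    after each E-step, rescale the unlabeled posteriors so their average
    class distribution matches the one estimated from labeled data."""
    Y_l = np.eye(n_classes)[y_l]                   # one-hot labeled targets
    target = Y_l.mean(axis=0)                      # class distribution from labeled data
    log_prior, log_cond = train_nb(X_l, Y_l)
    for _ in range(n_iter):
        P_u = posteriors(X_u, log_prior, log_cond)          # E-step
        # CDC: scale each class column toward the target distribution,
        # then renormalize each row back to a proper distribution.
        P_u *= target / (P_u.mean(axis=0) + 1e-12)
        P_u /= P_u.sum(axis=1, keepdims=True)
        # M-step on labeled data plus constrained unlabeled posteriors.
        log_prior, log_cond = train_nb(
            np.vstack([X_l, X_u]), np.vstack([Y_l, P_u]))
    return log_prior, log_cond

if __name__ == "__main__":
    # Tiny synthetic run: 20 labeled and 200 unlabeled count vectors.
    rng = np.random.default_rng(0)
    X_l = rng.poisson(1.0, size=(20, 50)).astype(float)
    y_l = rng.integers(0, 2, size=20)
    X_u = rng.poisson(1.0, size=(200, 50)).astype(float)
    log_prior, log_cond = em_with_cdc(X_l, y_l, X_u, n_classes=2)
    print(posteriors(X_l, log_prior, log_cond)[:3])
```

In this formulation the constraint acts as a calibration step between the E- and M-steps: without it, EM is free to drift toward a degenerate labeling of the unlabeled pool, which is the failure mode described above. Iterating the rescale-and-renormalize step (Sinkhorn-style) would enforce the target distribution more tightly at a small extra cost.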