Learning Instance Weighted Naive Bayes from labeled and unlabeled data

  • Authors:
  • Liangxiao Jiang

  • Affiliations:
  • Department of Computer Science, China University of Geosciences, Wuhan 430074, China

  • Venue:
  • Journal of Intelligent Information Systems
  • Year:
  • 2012

Abstract

In real-world data mining applications, unlabeled instances are often abundant while labeled instances are scarce. Semi-supervised learning, which attempts to benefit from large amounts of unlabeled data together with the labeled data, has therefore attracted much attention from researchers. In this paper, we propose a very fast yet highly effective semi-supervised learning algorithm, which we call Instance Weighted Naive Bayes (IWNB). IWNB first trains a naive Bayes classifier on the labeled instances only and uses it to estimate the class membership probabilities of the unlabeled instances. These estimated probabilities are then used to label and weight the unlabeled instances. Finally, a naive Bayes classifier is trained again on both the originally labeled data and the newly labeled, weighted unlabeled data. Our experimental results on a large number of UCI data sets show that IWNB often improves the classification accuracy of the original naive Bayes when the available labeled data are very limited.
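
The sketch below is a rough illustration of the four-step procedure the abstract describes, not the authors' implementation. It assumes scikit-learn's GaussianNB (the paper's naive Bayes may instead handle nominal attributes), and the function name iwnb_fit is purely illustrative.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def iwnb_fit(X_labeled, y_labeled, X_unlabeled):
    """Sketch of the IWNB procedure described in the abstract."""
    # Step 1: train naive Bayes on the labeled instances only.
    nb = GaussianNB().fit(X_labeled, y_labeled)

    # Step 2: estimate class membership probabilities of the unlabeled instances.
    proba = nb.predict_proba(X_unlabeled)

    # Step 3: label each unlabeled instance with its most probable class and
    # weight it by that probability; labeled instances keep weight 1.
    pseudo_labels = nb.classes_[proba.argmax(axis=1)]
    weights_unlabeled = proba.max(axis=1)

    # Step 4: retrain on the originally labeled data plus the newly labeled,
    # weighted unlabeled data.
    X_all = np.vstack([X_labeled, X_unlabeled])
    y_all = np.concatenate([y_labeled, pseudo_labels])
    w_all = np.concatenate([np.ones(len(y_labeled)), weights_unlabeled])
    return GaussianNB().fit(X_all, y_all, sample_weight=w_all)
```

In this reading, the instance weights simply discount unlabeled instances whose pseudo-labels the initial model is unsure about, so a confidently labeled instance contributes nearly as much as an originally labeled one while an ambiguous instance contributes little.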