Improving classification performance using unlabeled data: Naive Bayesian case

Authors: Chang-Hwan Lee
Affiliations: Department of Information and Communications, DongGuk University, 3-26 Pil-Dong, Chung-Gu, Seoul 100-715, Republic of Korea
Venue: Knowledge-Based Systems
Year: 2007

Abstract

In many applications, an enormous amount of unlabeled data is available at little cost, so it is natural to ask whether these data can be exploited in classification learning. In this paper, we analyze the role of unlabeled data in the context of naive Bayesian learning. Experimental results show that including unlabeled data as part of the training data can significantly improve classification accuracy.
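
The abstract does not say how the unlabeled data enter training, so the following is only an illustrative sketch and not the paper's procedure. It assumes one common scheme: fit a categorical naive Bayes model on the labeled examples, then alternate between soft-labeling the unlabeled examples with the current model and refitting on both sets (an EM-style loop). All function names, the Laplace smoothing constant `alpha`, and the toy data are assumptions made for illustration.

```python
import math


def train_nb(rows, weights, n_classes, n_values, alpha=1.0):
    """Fit categorical naive Bayes from per-class (possibly fractional) weights.

    rows: list of feature tuples with integer values in [0, n_values).
    weights: one list of n_classes weights per row (hard labels are one-hot).
    alpha: Laplace smoothing constant (an assumption of this sketch).
    """
    n_features = len(rows[0])
    class_w = [alpha] * n_classes
    counts = [[[alpha] * n_values for _ in range(n_features)]
              for _ in range(n_classes)]
    for x, w in zip(rows, weights):
        for c in range(n_classes):
            class_w[c] += w[c]
            for j, v in enumerate(x):
                counts[c][j][v] += w[c]
    total = sum(class_w)
    log_prior = [math.log(cw / total) for cw in class_w]
    log_like = [[[math.log(cnt / sum(feat)) for cnt in feat]
                 for feat in counts[c]] for c in range(n_classes)]
    return log_prior, log_like


def posterior(model, x):
    """Return P(class | x) under the naive Bayes independence assumption."""
    log_prior, log_like = model
    scores = [lp + sum(log_like[c][j][v] for j, v in enumerate(x))
              for c, lp in enumerate(log_prior)]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def fit_semi_supervised(labeled, labels, unlabeled, n_classes, n_values,
                        n_iter=10):
    """EM-style loop: soft-label the unlabeled rows, then refit on everything."""
    hard = [[1.0 if c == y else 0.0 for c in range(n_classes)] for y in labels]
    model = train_nb(labeled, hard, n_classes, n_values)
    for _ in range(n_iter):
        soft = [posterior(model, x) for x in unlabeled]          # E-step
        model = train_nb(labeled + unlabeled, hard + soft,
                         n_classes, n_values)                    # M-step
    return model


def predict(model, x):
    """Return the index of the most probable class for x."""
    p = posterior(model, x)
    return p.index(max(p))


# Toy data (hypothetical): two labeled rows, six unlabeled rows, binary features.
labeled = [(1, 1, 0), (0, 0, 1)]
labels = [0, 1]
unlabeled = [(1, 1, 0), (1, 1, 0), (1, 0, 0), (0, 0, 1), (0, 0, 1), (0, 1, 1)]

model = fit_semi_supervised(labeled, labels, unlabeled, n_classes=2, n_values=2)
print(predict(model, (1, 0, 0)), posterior(model, (1, 0, 0)))
```

The labeled rows keep their hard labels in every iteration, so the unlabeled rows can only refine the feature statistics rather than overwrite the supervision. Whether this helps on real data depends on how well the naive Bayes assumptions fit, which is the question the paper studies experimentally.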