Transferring Naive Bayes Classifiers for Text Classification

  • Authors:
  • Wenyuan Dai; Gui-Rong Xue; Qiang Yang; Yong Yu

  • Affiliations:
  • Wenyuan Dai, Gui-Rong Xue, Yong Yu: Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
  • Qiang Yang: Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong

  • Venue:
  • AAAI'07: Proceedings of the 22nd National Conference on Artificial Intelligence - Volume 1
  • Year:
  • 2007

Abstract

A basic assumption in traditional machine learning is that the training and test data are drawn from the same distribution. This assumption often fails to hold in practice, and we may be forced to rely on data from a different distribution to learn a prediction model. For example, labeling data in the domain of interest may be expensive, while plenty of labeled data is available in a related but different domain. In this paper, we propose a novel transfer-learning algorithm for text classification based on an EM-based Naive Bayes classifier. Our solution is to first estimate the initial probabilities under the distribution D_l of the labeled data set, and then use an EM algorithm to revise the model under the different distribution D_u of the unlabeled test data. We show that our algorithm is effective on several pairs of domains, where the distance between the distributions is measured using the Kullback-Leibler (KL) divergence. Moreover, the KL divergence is used to set the trade-off parameters in our algorithm. In our experiments, the algorithm outperforms traditional supervised and semi-supervised learning algorithms as the training and test distributions become increasingly different.
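
To make the approach concrete, the following is a minimal Python sketch of EM-based Naive Bayes transfer learning in the spirit described above. It is not the authors' exact implementation: the function names and the fixed trade-off weight lam are hypothetical stand-ins for the KL-divergence-based trade-off parameters the paper derives. The idea is to fit a multinomial Naive Bayes model on the labeled source data (distribution D_l) and then run EM over the unlabeled target data (distribution D_u), down-weighting the source counts in each M-step.

import numpy as np

def train_nb(X, y, n_classes, alpha=1.0):
    # Estimate class priors and per-class word probabilities with Laplace smoothing.
    # X: (n_docs, n_words) term-count matrix; y: integer class labels.
    priors = np.zeros(n_classes)
    word_probs = np.zeros((n_classes, X.shape[1]))
    for c in range(n_classes):
        Xc = X[y == c]
        priors[c] = (Xc.shape[0] + alpha) / (X.shape[0] + alpha * n_classes)
        counts = Xc.sum(axis=0) + alpha
        word_probs[c] = counts / counts.sum()
    return priors, word_probs

def posteriors(X, priors, word_probs):
    # E-step: P(class | document) under the current model, computed in log space.
    log_post = X @ np.log(word_probs).T + np.log(priors)
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

def em_transfer_nb(X_src, y_src, X_tgt, n_classes, lam=0.3, n_iter=20, alpha=1.0):
    # Fit on labeled source data (D_l), then adapt with EM on unlabeled target data (D_u).
    # `lam` weights the source-domain counts; in the paper this trade-off is set from
    # the KL divergence between the two distributions (here it is simply fixed).
    priors, word_probs = train_nb(X_src, y_src, n_classes, alpha)
    y_src_onehot = np.eye(n_classes)[y_src]
    X_all = np.vstack([X_src, X_tgt])
    for _ in range(n_iter):
        q_tgt = posteriors(X_tgt, priors, word_probs)           # E-step on target docs
        resp = np.vstack([lam * y_src_onehot, (1.0 - lam) * q_tgt])
        class_mass = resp.sum(axis=0)                           # soft class counts
        priors = (class_mass + alpha) / (class_mass.sum() + alpha * n_classes)
        counts = resp.T @ X_all + alpha                         # soft word counts per class
        word_probs = counts / counts.sum(axis=1, keepdims=True)
    return priors, word_probs

# Example: label the target-domain documents with the adapted model.
# priors, word_probs = em_transfer_nb(X_src, y_src, X_tgt, n_classes=2)
# y_tgt_pred = posteriors(X_tgt, priors, word_probs).argmax(axis=1)

In practice the trade-off weight would be chosen per domain pair, for example by decreasing lam as the KL divergence between the source and target word distributions grows, which mirrors the role the divergence plays in the paper.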