Bi-weighting domain adaptation for cross-language text classification

Authors:
Chang Wan;Rong Pan;Jiefei Li
Affiliations:
School of Information Science and Technology, Sun Yat-sen University, Guangzhou, China;School of Information Science and Technology, Sun Yat-sen University, Guangzhou, China;School of Information Science and Technology, Sun Yat-sen University, Guangzhou, China
Venue:
IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Volume Two
Year:
2011

Citing 11
Cited 1

Elements of information theory

Elements of information theory
Transductive Inference for Text Classification using Support Vector Machines

ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
An EM Based Training Algorithm for Cross-Language Text Categorization

WI '05 Proceedings of the 2005 IEEE/WIC/ACM International Conference on Web Intelligence
Proceedings of the 24th international conference on Machine learning

The 24th Annual International Conference on Machine Learning held in conjunction with the 2007 International Conference on Inductive Logic Programming
Boosting for transfer learning

Proceedings of the 24th international conference on Machine learning
Self-taught learning: transfer learning from unlabeled data

Proceedings of the 24th international conference on Machine learning
Can chinese web pages be classified with english data source?

Proceedings of the 17th international conference on World Wide Web
Manifold alignment using Procrustes analysis

Proceedings of the 25th international conference on Machine learning
Extracting discriminative concepts for domain adaptation in text mining

Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Domain adaptation via transfer component analysis

IJCAI'09 Proceedings of the 21st international jont conference on Artifical intelligence
LIBSVM: A library for support vector machines

ACM Transactions on Intelligent Systems and Technology (TIST)

Triplex transfer learning: exploiting both shared and distinct concepts for text classification

Proceedings of the sixth ACM international conference on Web search and data mining

Quantified Score

Hi-index	0.00

Visualization

Abstract

Text classification is widely used in many real-world applications. To obtain satisfied classification performance, most traditional data mining methods require lots of labeled data, which can be costly in terms of both time and human efforts. In reality, there are plenty of such resources in English since it has the largest population in the Internet world, which is not true in many other languages. In this paper, we present a novel transfer learning approach to tackle the cross-language text classification problems. We first align the feature spaces in both domains utilizing some on-line translation service, which makes the two feature spaces under the same coordinate. Although the feature sets in both domains are the same, the distributions of the instances in both domains are different, which violates the i.i.d. assumption in most traditional machine learning methods. For this issue, we propose an iterative feature and instance weighting (Bi-Weighting) method for domain adaptation. We empirically evaluate the effectiveness and efficiency of our approach. The experimental results show that our approach outperforms some baselines including four transfer learning algorithms.