Bridging Domains Using World Wide Knowledge for Transfer Learning

  • Authors:
  • Evan Wei Xiang, Bin Cao, Derek Hao Hu, Qiang Yang

  • Affiliations:
  • Hong Kong University of Science and Technology, Hong Kong (all authors)

  • Venue:
  • IEEE Transactions on Knowledge and Data Engineering
  • Year:
  • 2010

Abstract

A major problem in classification learning is the lack of ground-truth labeled data, since labeling new data instances to train a model is usually expensive. To address this problem, domain adaptation in transfer learning has been proposed to classify target-domain data by using data from a different source domain, even when the two domains may have different distributions. However, domain adaptation may not work well when the differences between the source and target domains are large. In this paper, we design a novel transfer learning approach, called BIG (Bridging Information Gap), to effectively extract useful knowledge from a worldwide knowledge base, which is then used to link the source and target domains and improve classification performance. BIG works when the source and target domains share the same feature space but have different underlying data distributions. Using the auxiliary source data, we can extract a “bridge” that allows cross-domain text classification problems to be solved with standard semi-supervised learning algorithms. A major contribution of our work is that with BIG, a large amount of worldwide knowledge can be easily adapted and used for learning in the target domain. We conduct experiments on several real-world cross-domain text classification tasks and demonstrate that our proposed approach significantly outperforms several existing domain adaptation approaches.
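
To make the bridging idea concrete, the sketch below is a minimal, illustrative example only, not the authors' BIG algorithm: it pools labeled source-domain documents, unlabeled auxiliary "bridge" documents (standing in for worldwide knowledge such as encyclopedia text), and unlabeled target-domain documents in a shared feature space, then runs an off-the-shelf semi-supervised learner (scikit-learn's LabelSpreading, chosen here as a generic stand-in) over the combined corpus. All documents, labels, and parameter values are hypothetical placeholders.

```python
# Illustrative sketch only -- NOT the authors' BIG algorithm. It shows the
# general pattern the abstract describes: labeled source data + unlabeled
# auxiliary "bridge" data + unlabeled target data, classified together with
# a standard semi-supervised learner over a shared feature space.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.semi_supervised import LabelSpreading

# Labeled source-domain documents (0 = "sports", 1 = "politics").
source_docs = ["the team won the match", "the election results were close"]
source_labels = [0, 1]

# Unlabeled auxiliary documents acting as a bridge (e.g., encyclopedia text).
bridge_docs = ["a tournament is a sporting competition", "a parliament passes laws"]

# Unlabeled target-domain documents we want to classify.
target_docs = ["the striker scored twice", "the senator proposed a new bill"]

docs = source_docs + bridge_docs + target_docs
labels = source_labels + [-1] * (len(bridge_docs) + len(target_docs))  # -1 = unlabeled

# All domains are represented in the same TF-IDF feature space.
X = TfidfVectorizer().fit_transform(docs).toarray()

model = LabelSpreading(kernel="knn", n_neighbors=3)
model.fit(X, labels)

# Transduced labels for the target-domain documents.
print(model.transduction_[-len(target_docs):])
```

In this toy setup, labels propagate from the source documents through the auxiliary documents to the target documents via feature-space similarity, which is the role the abstract assigns to the extracted "bridge".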