The problem of distribution differences among multiple data domains has been studied in cross-domain text classification. In this study, we present two new observations along this line. First, the distribution difference may arise because different domains use different key words to express the same concept. Second, the association between these word concepts and the document classes may be stable across domains. These two observations correspond, respectively, to the distinction and the commonality across data domains. Inspired by them, we propose a generative statistical model, named Collaborative Dual-PLSA (CD-PLSA), to simultaneously capture both the domain distinction and the commonality among multiple domains. Unlike Probabilistic Latent Semantic Analysis (PLSA), which has only one latent variable, the proposed model has two latent factors y and z, corresponding to the word concept and the document class respectively. The shared commonality intertwines with the distinctions over multiple domains, and also serves as the bridge for knowledge transfer. We derive an Expectation-Maximization (EM) algorithm to learn this model, and also propose a distributed version to handle the situation where the data domains are geographically separated from each other. Finally, we conduct extensive experiments over hundreds of classification tasks with multiple source domains and multiple target domains to validate the superiority of the proposed CD-PLSA model over existing state-of-the-art supervised and transfer learning methods. In particular, we show that CD-PLSA is more tolerant of distribution differences.
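The two-latent-factor idea can be illustrated with a minimal single-domain sketch. The abstract does not give the model's equations, so the factorization P(w, d) = Σ_{y,z} P(w|y) P(d|z) P(y, z) and the EM updates below are assumptions based on a generic dual-latent PLSA; the full CD-PLSA additionally ties P(y, z) across multiple domains, which this toy version omits.

```python
import numpy as np

def dual_plsa_em(counts, n_concepts, n_classes, n_iters=50, seed=0):
    """EM for a toy dual-latent PLSA (hypothetical sketch, not the paper's
    full multi-domain CD-PLSA).

    counts: (n_docs, n_words) term-count matrix.
    Model assumption: P(w, d) = sum_{y,z} P(w|y) P(d|z) P(y, z),
    where y indexes word concepts and z indexes document classes.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape

    # Random normalized initialization of the three factors.
    p_w_y = rng.random((n_words, n_concepts))
    p_w_y /= p_w_y.sum(axis=0, keepdims=True)      # columns are P(w|y)
    p_d_z = rng.random((n_docs, n_classes))
    p_d_z /= p_d_z.sum(axis=0, keepdims=True)      # columns are P(d|z)
    p_yz = rng.random((n_concepts, n_classes))
    p_yz /= p_yz.sum()                             # joint P(y, z)

    for _ in range(n_iters):
        # E-step: posterior P(y, z | d, w), shape (docs, words, y, z).
        joint = (p_w_y[None, :, :, None]
                 * p_d_z[:, None, None, :]
                 * p_yz[None, None, :, :])
        post = joint / joint.sum(axis=(2, 3), keepdims=True)

        # M-step: re-estimate each factor from expected counts.
        weighted = counts[:, :, None, None] * post
        p_w_y = weighted.sum(axis=(0, 3))          # sum over docs, z
        p_w_y /= p_w_y.sum(axis=0, keepdims=True)
        p_d_z = weighted.sum(axis=(1, 2))          # sum over words, y
        p_d_z /= p_d_z.sum(axis=0, keepdims=True)
        p_yz = weighted.sum(axis=(0, 1))           # sum over docs, words
        p_yz /= p_yz.sum()

    return p_w_y, p_d_z, p_yz
```

In a multi-domain setting one would keep a separate P(w|y) per domain (the distinction) while sharing P(y, z) across domains (the commonality that bridges knowledge transfer), as the abstract describes.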