Co-clustering sentences and terms for multi-document summarization
CICLing'11 Proceedings of the 12th international conference on Computational linguistics and intelligent text processing - Volume Part II
High-order co-clustering text data on semantics-based representation model
PAKDD'11 Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part I
Abstracting for Dimensionality Reduction in Text Classification
International Journal of Intelligent Systems
Enhancing short text clustering with small external repositories
AusDM '11 Proceedings of the Ninth Australasian Data Mining Conference - Volume 121
Learning person-specific models for facial expression and action unit recognition
Pattern Recognition Letters
Relational term-suggestion graphs incorporating multipartite concept and expertise networks
ACM Transactions on Intelligent Systems and Technology (TIST) - Special Section on Intelligent Mobile Knowledge Discovery and Management Systems and Special Issue on Social Web Mining
Transfer learning with one-class data
Pattern Recognition Letters
Traditional approaches to document classification require labeled data in order to construct reliable and accurate classifiers. Unfortunately, labeled data are seldom available and are often too expensive to obtain. Given a learning task for which training data are not available, abundant labeled data may exist for a different but related domain. One would like to use the related labeled data as auxiliary information to accomplish the classification task in the target domain. Recently, the paradigm of transfer learning has been introduced to enable effective learning strategies when auxiliary data obey a different probability distribution. A co-clustering based classification algorithm has been previously proposed to tackle cross-domain text classification. In this work, we extend the idea underlying this approach by making the latent semantic relationship between the two domains explicit. This goal is achieved with the use of Wikipedia. As a result, the pathway that allows labels to propagate between the two domains captures not only common words, but also semantic concepts based on the content of documents. We empirically demonstrate the efficacy of our semantic-based approach to cross-domain classification using a variety of real data.
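The core idea of the abstract — that labels can propagate from a source domain to a target domain through shared semantic concepts rather than only shared words — can be illustrated with a minimal sketch. The word-to-concept map below is hypothetical, standing in for a real lookup against Wikipedia, and the nearest-centroid classifier is a simple stand-in for the co-clustering algorithm described in the paper.

```python
import numpy as np

# Hypothetical word -> Wikipedia-concept mapping (a real system would
# derive this from Wikipedia article/category structure).
CONCEPTS = {"goal": "Sport", "match": "Sport", "inning": "Sport",
            "pitcher": "Sport", "senate": "Politics", "vote": "Politics",
            "ballot": "Politics", "election": "Politics"}

VOCAB = sorted(CONCEPTS)                      # word features
CONCEPT_IDS = sorted(set(CONCEPTS.values()))  # concept features

def vectorize(words):
    """Concatenate a bag-of-words vector with a bag-of-concepts vector."""
    v = np.zeros(len(VOCAB) + len(CONCEPT_IDS))
    for w in words:
        if w in CONCEPTS:
            v[VOCAB.index(w)] += 1.0
            v[len(VOCAB) + CONCEPT_IDS.index(CONCEPTS[w])] += 1.0
    return v

def nearest_centroid(train, labels, doc):
    """Assign doc to the label whose class centroid is closest."""
    classes = sorted(set(labels))
    cents = {c: np.mean([train[i] for i, l in enumerate(labels) if l == c],
                        axis=0) for c in classes}
    return min(classes, key=lambda c: np.linalg.norm(doc - cents[c]))

# Source domain: soccer vs. elections; target domain: baseball.
src = [vectorize(["goal", "match"]), vectorize(["vote", "election"])]
lab = ["sports", "politics"]
tgt = vectorize(["inning", "pitcher"])  # shares no words with the source docs

print(nearest_centroid(src, lab, tgt))  # the shared concept bridges the domains
```

Even though the target document shares no literal words with either source document, the shared "Sport" concept pulls it toward the sports centroid, which is the semantic pathway the abstract describes.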