Co-clustering based classification for out-of-domain documents

Authors:
Wenyuan Dai;Gui-Rong Xue;Qiang Yang;Yong Yu
Affiliations:
Shanghai Jiao Tong University;Shanghai Jiao Tong University;Hong Kong University of Science and Technology;Shanghai Jiao Tong University
Venue:
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2007

Abstract

In many real-world applications, labeled data are in short supply. Obtaining labeled data in a new domain is often expensive and time-consuming, while plenty of labeled data may be available from a related but different domain. Traditional machine learning does not cope well with learning across domains. In this paper, we address this problem for a text-mining task in which the labeled data follow one distribution in one domain (the in-domain data), while the unlabeled data come from a related but different domain (the out-of-domain data). Our general goal is to learn from the in-domain data and apply the learned knowledge to the out-of-domain data. We propose a co-clustering based classification (CoCC) algorithm to tackle this problem. Co-clustering is used as a bridge to propagate class structure and knowledge from the in-domain data to the out-of-domain data. We present theoretical and empirical analysis showing that our algorithm produces high-quality classification results even when the two data sets follow different distributions. The experimental results show that our algorithm greatly improves classification performance over traditional learning algorithms.
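
To make the "bridge" idea in the abstract concrete, here is a minimal toy sketch in Python. It is not the CoCC algorithm from the paper: it swaps in a simple hard-assignment voting heuristic so that the alternating structure is easy to follow. The function name `cocluster_classify`, the parameters `lam` and `n_iter`, and the example documents are all hypothetical, chosen only for illustration. Word clusters are seeded from the in-domain labels, out-of-domain documents are clustered by the word clusters they contain, and words are then re-clustered using evidence from both domains.

```python
from collections import Counter, defaultdict


def cocluster_classify(in_docs, in_labels, out_docs, classes, lam=1.0, n_iter=5):
    """Toy alternating co-clustering; NOT the paper's objective (assumption)."""
    # Word-by-class counts taken from the labeled in-domain documents.
    in_counts = defaultdict(Counter)
    for doc, label in zip(in_docs, in_labels):
        for w in doc:
            in_counts[w][label] += 1

    def best(scores):
        # Highest score wins, ties go to the earlier class, None if no evidence.
        top = max(classes, key=lambda c: scores.get(c, 0))
        return top if scores.get(top, 0) > 0 else None

    # Step 1: initial word clusters come from in-domain labels only, so words
    # that occur only in the out-of-domain documents start unassigned (None).
    vocab = set(in_counts) | {w for doc in out_docs for w in doc}
    word_cluster = {w: best(in_counts.get(w, {})) for w in vocab}

    out_labels = [None] * len(out_docs)
    for _ in range(n_iter):
        # Step 2: cluster each out-of-domain document by its words' clusters.
        for i, doc in enumerate(out_docs):
            votes = Counter(
                word_cluster[w] for w in doc if word_cluster[w] is not None
            )
            out_labels[i] = best(votes)

        # Step 3: re-cluster every word using in-domain counts (weighted by
        # lam) plus the current out-of-domain document clusters.
        scores = {
            w: {c: lam * in_counts.get(w, {}).get(c, 0) for c in classes}
            for w in vocab
        }
        for doc, label in zip(out_docs, out_labels):
            if label is not None:
                for w in doc:
                    scores[w][label] += 1
        word_cluster = {w: best(scores[w]) for w in vocab}

    return out_labels, word_cluster


# Hypothetical toy data: the two domains share only "team" and "party".
in_docs = [
    ["goal", "match", "team"],
    ["match", "score", "team"],
    ["election", "vote", "party"],
    ["vote", "policy", "party"],
]
in_labels = ["sport", "sport", "politics", "politics"]
out_docs = [
    ["team", "league", "striker"],
    ["league", "striker", "transfer"],
    ["party", "senate", "ballot"],
    ["senate", "ballot", "reform"],
]

labels, clusters = cocluster_classify(
    in_docs, in_labels, out_docs, classes=["sport", "politics"]
)
print(labels)
```

In this toy data the second and fourth out-of-domain documents share no vocabulary with the labeled data, so a classifier restricted to in-domain words has no evidence for them. They only receive a label after the out-of-domain words "league", "striker", "senate", and "ballot" have joined a word cluster through the first and third documents, which is the kind of propagation the abstract describes.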