Multi-task clustering via domain adaptation

Authors:
Zhihao Zhang; Jie Zhou
Affiliations:
Tsinghua National Laboratory for Information Science and Technology (TNList), State Key Laboratory on Intelligent Technology and Systems, Department of Automation, Tsinghua University, Beijing 100 ...;Tsinghua National Laboratory for Information Science and Technology (TNList), State Key Laboratory on Intelligent Technology and Systems, Department of Automation, Tsinghua University, Beijing 100 ...
Venue:
Pattern Recognition
Year:
2012

Abstract

Clustering is a fundamental topic in pattern recognition and machine learning research. Traditional clustering methods deal with a single clustering task on a single data set. However, many real applications involve multiple similar clustering tasks simultaneously, e.g., clustering the clients of different shopping websites, where data from different subjects are collected for each task. These tasks are cross-domain but closely related. It has been shown that the performance of each individual clustering task can be improved by appropriately exploiting the underlying relation. In this paper, we propose a new approach that performs multiple related clustering tasks simultaneously through domain adaptation. A shared subspace is learned through domain adaptation, in which the gap between the distributions of the tasks is reduced, and the shared knowledge is transferred across all tasks by exploiting the strengthened relation in the learned subspace. The objective is then to find the best clustering in both the original and the learned spaces. An alternating optimization method is introduced, and its convergence is theoretically guaranteed. Experiments on both synthetic and real data sets demonstrate the effectiveness of the proposed approach.
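
The idea of clustering jointly in the original space and in a shared subspace, optimized by alternating updates, can be illustrated with a small toy sketch. This is not the algorithm from the paper, whose formulation is not given in the abstract. It assumes a k-means-style squared-error objective, and it uses per-task centering followed by PCA on the pooled data as a crude, fixed stand-in for the learned domain-adaptation subspace. The function name `multitask_kmeans` and the parameters `dim` and `lam` are hypothetical.

```python
import numpy as np

def sqdist(A, B):
    """Pairwise squared Euclidean distances between rows of A and rows of B."""
    return ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)

def multitask_kmeans(tasks, k, dim, lam=1.0, iters=20, seed=0):
    """Toy alternating minimization over several related clustering tasks.

    Illustrative only: the shared subspace here is a fixed PCA projection
    (an assumption), not a subspace learned through domain adaptation.
    """
    rng = np.random.default_rng(seed)
    # Center each task separately to shrink the gap between task distributions.
    centered = [X - X.mean(axis=0) for X in tasks]
    pooled = np.vstack(centered)
    # Shared projection: top `dim` principal directions of the pooled data.
    _, _, vt = np.linalg.svd(pooled, full_matrices=False)
    W = vt[:dim].T
    Z = [X @ W for X in centered]
    pooled_z = np.vstack(Z)

    # Per-task centroids (original space) and shared centroids (subspace).
    local = [X[rng.choice(len(X), k, replace=False)] for X in centered]
    shared = pooled_z[rng.choice(len(pooled_z), k, replace=False)]

    history = []
    for _ in range(iters):
        # Step 1: fix all centroids, reassign every point to the cluster
        # with the lowest combined cost in both spaces.
        labels = [
            (sqdist(X, C) + lam * sqdist(z, shared)).argmin(axis=1)
            for X, z, C in zip(centered, Z, local)
        ]
        # Step 2: fix assignments, update centroids as cluster means.
        for X, lab, C in zip(centered, labels, local):
            for j in range(k):
                if np.any(lab == j):
                    C[j] = X[lab == j].mean(axis=0)
        all_lab = np.concatenate(labels)
        for j in range(k):
            if np.any(all_lab == j):
                shared[j] = pooled_z[all_lab == j].mean(axis=0)
        # Record the objective; neither step can increase it.
        obj = sum(
            ((X - C[lab]) ** 2).sum() + lam * ((z - shared[lab]) ** 2).sum()
            for X, z, C, lab in zip(centered, Z, local, labels)
        )
        history.append(obj)
    return labels, history
```

In this sketch the shared centroids are what couple the tasks: each cluster index refers to one centroid in the subspace that is estimated from all tasks together, while each task keeps its own centroids in the original space. Because both steps are exact minimizations of the same objective over one block of variables, the recorded objective values are non-increasing, which is the usual argument behind convergence of alternating schemes of this kind.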